Minari

Create Minari Dataset

minari.create_dataset_from_buffers(dataset_id: str, buffer: List[Dict[str, list | Dict]], env: str | gym.Env | EnvSpec | None = None, eval_env: str | gym.Env | EnvSpec | None = None, algorithm_name: str | None = None, author: str | None = None, author_email: str | None = None, code_permalink: str | None = None, minari_version: str | None = None, action_space: gym.spaces.Space | None = None, observation_space: gym.spaces.Space | None = None, ref_min_score: float | None = None, ref_max_score: float | None = None, expert_policy: Callable[[ObsType], ActType] | None = None, num_episodes_average_score: int = 100)

Create Minari dataset from a list of episode dictionary buffers.

The dataset_id parameter is the name of the dataset and follows the syntax (env_name-)(dataset_name)(-v(version)), where env_name identifies the environment used to generate the dataset dataset_name. The same dataset_id is later used to load the dataset with minari.load_dataset().
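As a rough illustration of the naming syntax above, the sketch below splits an id into its three parts. The helper parse_dataset_id and its regular expression are hypothetical and not part of Minari; the pattern assumes that env_name and dataset_name contain no hyphens, which Minari itself may not require.

```python
import re
from typing import Tuple

# Assumed pattern, for illustration only: env_name and dataset_name contain no
# hyphens and the version is a non-negative integer. Minari's own validation
# may accept other forms.
DATASET_ID_PATTERN = re.compile(
    r"^(?P<env_name>\w+)-(?P<dataset_name>\w+)-v(?P<version>\d+)$"
)


def parse_dataset_id(dataset_id: str) -> Tuple[str, str, int]:
    """Split an id such as 'door-human-v1' into (env_name, dataset_name, version)."""
    match = DATASET_ID_PATTERN.match(dataset_id)
    if match is None:
        raise ValueError(f"unexpected dataset id format: {dataset_id!r}")
    return match["env_name"], match["dataset_name"], int(match["version"])


print(parse_dataset_id("door-human-v1"))  # ('door', 'human', 1)
```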

Each episode dictionary buffer must have the following items:
  • observations: np.ndarray of step observations. shape = (total_episode_steps + 1, (observation_shape)). Should include the initial and final observations.

  • actions: np.ndarray of step actions. shape = (total_episode_steps, (action_shape)).

  • rewards: np.ndarray of step rewards. shape = (total_episode_steps, 1).

  • terminations: np.ndarray of step terminations. shape = (total_episode_steps, 1).

  • truncations: np.ndarray of step truncations. shape = (total_episode_steps, 1).

Additional items can be added as long as their values are np.ndarrays or nested dictionaries.
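The length requirements listed above can be checked before a buffer is handed to Minari. The following is a minimal sketch: check_episode_buffer is a hypothetical helper, not a Minari function, and it only compares the leading dimension of each item, assuming array-like (not dictionary) observations. The example episode uses plain lists and made-up values; np.ndarray values pass the same len() checks.

```python
from typing import Any, Dict

STEP_KEYS = ("actions", "rewards", "terminations", "truncations")


def check_episode_buffer(episode: Dict[str, Any]) -> int:
    """Return the number of steps if the required items have consistent lengths."""
    missing = [key for key in ("observations", *STEP_KEYS) if key not in episode]
    if missing:
        raise KeyError(f"episode buffer is missing items: {missing}")

    total_steps = len(episode["actions"])
    # observations hold one extra entry: the initial observation plus one per step
    if len(episode["observations"]) != total_steps + 1:
        raise ValueError("observations must have total_episode_steps + 1 entries")
    for key in STEP_KEYS:
        if len(episode[key]) != total_steps:
            raise ValueError(f"{key} must have total_episode_steps entries")
    return total_steps


# A two-step episode with illustrative values.
episode = {
    "observations": [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1]],
    "actions": [[1.0], [0.5]],
    "rewards": [0.0, 1.0],
    "terminations": [False, True],
    "truncations": [False, False],
}
buffer = [episode]  # one dictionary per episode

print(check_episode_buffer(episode))  # 2
```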

Parameters:
  • dataset_id (str) – name id to identify Minari dataset.

  • buffer (list[Dict[str, Union[list, Dict]]]) – list of episode dictionaries with data.

  • env (Optional[str|gym.Env|EnvSpec]) – Gymnasium environment(gym.Env)/environment id(str)/environment spec(EnvSpec) used to collect the buffer data. Defaults to None.

  • eval_env (Optional[str|gym.Env|EnvSpec]) – Gymnasium environment(gym.Env)/environment id(str)/environment spec(EnvSpec) to use for evaluation with the dataset. After loading the dataset, the environment can be recovered as follows: MinariDataset.recover_environment(eval_env=True). If None, and if the env used to collect the buffer data is available, the latter will be used for evaluation.

  • algorithm_name (Optional[str], optional) – name of the algorithm used to collect the data. Defaults to None.

  • author (Optional[str], optional) – author that generated the dataset. Defaults to None.

  • author_email (Optional[str], optional) – email of the author that generated the dataset. Defaults to None.

  • code_permalink (Optional[str], optional) – link to relevant code used to generate the dataset. Defaults to None.

  • ref_min_score (Optional[float], optional) – minimum reference score from the average returns of a random policy. This value is later used to normalize a score with minari.get_normalized_score(). If None (default), the value is estimated with a default random policy. Note that this attribute is added to the Minari dataset only if ref_max_score or expert_policy is given a value other than None.

  • ref_max_score (Optional[float], optional) – maximum reference score from the average returns of a hypothetical expert policy. This value is used in minari.get_normalized_score(). Defaults to None.

  • expert_policy (Optional[Callable[[ObsType], ActType]], optional) – policy used to compute ref_max_score by averaging the returns over a number of episodes equal to num_episodes_average_score. ref_max_score and expert_policy can’t be passed at the same time. Defaults to None.

  • num_episodes_average_score (int) – number of episodes to average over the returns to compute ref_min_score and ref_max_score. Defaults to 100.

  • action_space (Optional[gym.spaces.Space]) – action space of the environment. If None (default) use the environment action space.

  • observation_space (Optional[gym.spaces.Space]) – observation space of the environment. If None (default) use the environment observation space.

  • minari_version (Optional[str], optional) – Minari version specifier compatible with the dataset. If None (default) use the installed Minari version.

Returns:

MinariDataset

Load Minari Dataset

minari.load_dataset(dataset_id: str, download: bool = False)

Retrieve Minari dataset from local database.

Parameters:
  • dataset_id (str) – name id of Minari dataset

  • download (bool) – if True, download the dataset if it is not found locally. Defaults to False.

Returns:

MinariDataset

Split Minari Dataset

minari.split_dataset(dataset: MinariDataset, sizes: List[int], seed: int | None = None) → List[MinariDataset]

Split a MinariDataset into multiple datasets.

Parameters:
  • dataset (MinariDataset) – the MinariDataset to split

  • sizes (List[int]) – sizes of the resulting datasets

  • seed (Optional[int]) – random seed

Returns:

datasets (List[MinariDataset]) – resulting list of datasets
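To make the roles of sizes and seed concrete, the sketch below partitions episode indices rather than a real dataset. It is an illustration under assumptions, not Minari's implementation: the helper split_episode_indices is hypothetical, and it assumes that episodes are shuffled with the seed and that sizes may not add up to more episodes than the dataset holds.

```python
import random
from typing import List, Optional


def split_episode_indices(
    total_episodes: int, sizes: List[int], seed: Optional[int] = None
) -> List[List[int]]:
    """Shuffle episode indices and cut them into consecutive groups of the given sizes."""
    if sum(sizes) > total_episodes:
        raise ValueError("sizes add up to more episodes than the dataset holds")

    indices = list(range(total_episodes))
    random.Random(seed).shuffle(indices)  # same seed, same split

    splits, start = [], 0
    for size in sizes:
        splits.append(indices[start:start + size])
        start += size
    return splits


# Split a hypothetical 10-episode dataset into 8 training and 2 test episodes.
train, test = split_episode_indices(10, sizes=[8, 2], seed=0)
print(len(train), len(test))  # 8 2
```

With a fixed seed the same call returns the same partition, which is the usual reason for exposing a seed argument in a splitting function.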

Download Minari Dataset

minari.download_dataset(dataset_id: str, force_download: bool = False)

Download dataset from remote Farama server.

An error will be raised if the dataset version is not compatible with the locally installed version of Minari. This error can be skipped, and the download continued, with the force_download argument. With force_download, any local dataset that matches the id of the dataset being downloaded will also be overwritten.

Parameters:
  • dataset_id (str) – name id of the Minari dataset

  • force_download (bool) – boolean flag to force the download of the dataset. Defaults to False.

List Minari Datasets

minari.list_local_datasets(latest_version: bool = False, compatible_minari_version: bool = False) → Dict[str, Dict[str, str | int | bool]]

Get the ids and metadata of all the Minari datasets in the local database.

Parameters:
  • latest_version (bool) – if True, only the latest version of each dataset is returned, e.g. from [‘door-human-v0’, ‘door-human-v1’], only the metadata for v1 is returned. Defaults to False.

  • compatible_minari_version (bool) – if True only the datasets compatible with the current Minari version are returned. Default to False.

Returns:

Dict[str, Dict[str, str]] – the keys are the names of the Minari datasets and the values are their metadata
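The effect of latest_version can be pictured as keeping, for each dataset name, only the id with the highest version suffix. The sketch below is a hypothetical stand-alone helper that shows this filtering on a list of ids; it is not how Minari implements the option, and the id list is illustrative.

```python
import re
from typing import Dict, List


def keep_latest_versions(dataset_ids: List[str]) -> List[str]:
    """Keep only the highest -v<version> id for each dataset name."""
    latest: Dict[str, int] = {}
    for dataset_id in dataset_ids:
        match = re.fullmatch(r"(?P<name>.+)-v(?P<version>\d+)", dataset_id)
        if match is None:
            continue  # ids without a -v<version> suffix are skipped in this sketch
        name, version = match["name"], int(match["version"])
        latest[name] = max(version, latest.get(name, -1))
    return [f"{name}-v{version}" for name, version in latest.items()]


print(keep_latest_versions(["door-human-v0", "door-human-v1", "pen-human-v0"]))
# ['door-human-v1', 'pen-human-v0']
```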

minari.list_remote_datasets(latest_version: bool = False, compatible_minari_version: bool = False) → Dict[str, Dict[str, str]]

Get the names and metadata of all the Minari datasets in the remote Farama server.

Parameters:
  • latest_version (bool) – if True, only the latest version of each dataset is returned, e.g. from [‘door-human-v0’, ‘door-human-v1’], only the metadata for v1 is returned. Defaults to False.

  • compatible_minari_version (bool) – if True only the datasets compatible with the current Minari version are returned. Default to False.

Returns:

Dict[str, Dict[str, str]] – the keys are the names of the Minari datasets and the values are their metadata

Delete Minari Datasets

minari.delete_dataset(dataset_id: str)

Delete a Minari dataset from the local Minari database.

Parameters:

dataset_id (str) – name id of the Minari dataset

Combine Minari Datasets

minari.combine_datasets(datasets_to_combine: List[MinariDataset], new_dataset_id: str)

Combine a group of MinariDataset objects into a single dataset with its own name id.

The new dataset will contain a metadata attribute combined_datasets containing a list with the dataset names that were combined to form this new Minari dataset.

Parameters:
  • datasets_to_combine (list[MinariDataset]) – list of datasets to be combined

  • new_dataset_id (str) – name id for the newly created dataset

Returns:

combined_dataset (MinariDataset) – the resulting MinariDataset

Normalize Score

minari.get_normalized_score(dataset: MinariDataset, returns: ndarray) → ndarray

Normalize undiscounted return of an episode.

This function was originally provided in the D4RL repository. The normalized episode return (normalized score) makes it easier to compare algorithm performance across different tasks. The normalized score lies in a range between 0 and 1, where 0 corresponds to the minimum reference score, calculated as the average of episode returns collected from a random policy in the environment, and 1 corresponds to the maximum reference score, computed as the average of episode returns from a hypothetical expert policy. These two values are stored as optional attributes of a MinariDataset, ref_min_score and ref_max_score respectively.

The formula to normalize an episode return is:

\[normalize\_score = \frac{return - ref\_min\_score}{ref\_max\_score - ref\_min\_score}\]
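As a worked example of the formula, the sketch below applies it to plain floats. The function normalized_score is a stand-alone illustration, not the Minari function, and the reference scores are made-up numbers.

```python
def normalized_score(
    episode_return: float, ref_min_score: float, ref_max_score: float
) -> float:
    """Apply (return - ref_min_score) / (ref_max_score - ref_min_score)."""
    return (episode_return - ref_min_score) / (ref_max_score - ref_min_score)


# Made-up reference scores, for illustration only.
ref_min_score, ref_max_score = -50.0, 150.0

print(normalized_score(-50.0, ref_min_score, ref_max_score))  # 0.0 (random-policy level)
print(normalized_score(150.0, ref_min_score, ref_max_score))  # 1.0 (expert-policy level)
print(normalized_score(50.0, ref_min_score, ref_max_score))   # 0.5 (halfway between)
```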

Warning

This utility function is under testing and will not be available in every Minari dataset. For now, only the datasets imported from D4RL will contain the ref_min_score and ref_max_score attributes.

Parameters:
  • dataset (MinariDataset) – the MinariDataset with respect to which the score is normalized. Must contain the reference score attributes ref_min_score and ref_max_score.

  • returns (np.ndarray) – a single value or array of episode undiscounted returns to normalize.

Returns:

normalized_scores