Expert

Description

Expert data generated by CleanRL PPO Impala. The training code uses standard Atari preprocessing tricks, such as frame stacking, grayscale conversion, and resizing. However, the dataset observations are not preprocessed: they are stored as raw RGB frames. You can find the training code, model, and metrics on CleanRL’s Hugging Face repository.
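Because the stored observations are raw frames, anything that consumes them with a model trained on preprocessed input has to reapply the preprocessing. The sketch below shows one way to do grayscale conversion, resizing, and frame stacking with NumPy alone. The 84x84 target size, the stack depth of 4, the luma weights, and the nearest-neighbour resize are assumptions based on common Atari practice, not values confirmed by this page; the function names are hypothetical, and the training code should be checked for the exact settings.

```python
from collections import deque

import numpy as np


def to_grayscale(frame: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) uint8 RGB frame to an (H, W) uint8 luma image."""
    # ITU-R BT.601 luma weights (assumed; the training code may differ).
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return (frame.astype(np.float32) @ weights).round().astype(np.uint8)


def resize_nearest(frame: np.ndarray, size: int = 84) -> np.ndarray:
    """Nearest-neighbour resize of a 2-D frame to (size, size)."""
    height, width = frame.shape
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return frame[rows[:, None], cols[None, :]]


def preprocess_episode(observations, size: int = 84, num_stack: int = 4) -> np.ndarray:
    """Return an array of shape (T, num_stack, size, size) for T raw frames.

    The first stack is padded by repeating the first frame.
    """
    buffer = deque(maxlen=num_stack)
    stacked = []
    for frame in observations:
        small = resize_nearest(to_grayscale(frame), size)
        if not buffer:
            buffer.extend([small] * num_stack)  # pad the start of the episode
        else:
            buffer.append(small)  # oldest frame is dropped automatically
        stacked.append(np.stack(list(buffer)))
    return np.stack(stacked)
```

With a Minari episode, `preprocess_episode(episode.observations)` would produce one stack per stored observation, newest frame last.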

Dataset Specs


Total Steps	6864
Total Episodes	10
Dataset Observation Space	`Box(0, 255, (210, 160, 3), uint8)`
Dataset Action Space	`Discrete(18)`
Algorithm	CleanBA PPO Impala
Author	Omar G. Younis
Email	omar@farama.org
Code Permalink	https://github.com/Farama-Foundation/minari-dataset-generation-scripts
Minari Version	`0.5.3` (supported)
Download	`minari download atari/centipede/expert-v0`
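After downloading, the totals in the table can be cross-checked against the data itself. The following is a minimal sketch: `summarize_episodes` and `summarize_dataset` are hypothetical helper names, and the only Minari calls used are `minari.load_dataset` and `dataset.iterate_episodes()`. It relies on each episode exposing a `rewards` sequence with one entry per step.

```python
from statistics import mean


def summarize_episodes(episodes):
    """Return episode/step totals and mean return for an iterable of episodes.

    Each episode only needs a `rewards` sequence with one entry per step.
    """
    lengths, returns = [], []
    for episode in episodes:
        lengths.append(len(episode.rewards))
        returns.append(float(sum(episode.rewards)))
    return {
        "total_episodes": len(lengths),
        "total_steps": sum(lengths),
        "mean_length": mean(lengths),
        "mean_return": mean(returns),
    }


def summarize_dataset(dataset_id="atari/centipede/expert-v0"):
    """Load a locally available Minari dataset and summarize its episodes."""
    # Imported lazily so summarize_episodes can be used without Minari installed.
    import minari

    dataset = minari.load_dataset(dataset_id)
    return summarize_episodes(dataset.iterate_episodes())
```

If the local copy matches the table, `summarize_dataset()` should report 10 episodes and 6864 steps, which works out to a mean episode length of 686.4 steps.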

Environment Specs

The rows of the following table are the Gymnasium environment specifications used to generate the dataset. For the meaning of each parameter, see the Gymnasium documentation: https://gymnasium.farama.org/api/registry/#gymnasium.envs.registration.EnvSpec

This environment can be recovered from the Minari dataset as follows:

import minari

dataset = minari.load_dataset('atari/centipede/expert-v0')
env = dataset.recover_environment()


ID	ALE/Centipede-v5
Observation Space	`Box(0, 255, (210, 160, 3), uint8)`
Action Space	`Discrete(18)`
entry_point	`ale_py.env:AtariEnv`
max_episode_steps	27000
reward_threshold	`None`
nondeterministic	`False`
order_enforce	`True`
disable_env_checker	`False`
kwargs	`{'game': 'centipede', 'obs_type': 'rgb', 'repeat_action_probability': 0, 'full_action_space': False, 'frameskip': 4, 'max_num_frames_per_episode': 108000}`
additional_wrappers	`()`
vector_entry_point	`None`
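The values in the table above are internally consistent: with `frameskip` 4, the limit of 27000 steps corresponds to 4 × 27000 = 108000 emulator frames, which matches `max_num_frames_per_episode`. The sketch below encodes the table as constants and compares them against an environment spec. `spec_mismatches` and `frame_budget` are hypothetical helpers; they only read the `id` and `kwargs` attributes of a spec object.

```python
EXPECTED_ID = "ALE/Centipede-v5"
EXPECTED_MAX_EPISODE_STEPS = 27000
EXPECTED_KWARGS = {
    "game": "centipede",
    "obs_type": "rgb",
    "repeat_action_probability": 0,
    "full_action_space": False,
    "frameskip": 4,
    "max_num_frames_per_episode": 108000,
}


def frame_budget(frameskip, max_episode_steps):
    """Emulator frames consumed when every step repeats the action `frameskip` times."""
    return frameskip * max_episode_steps


def spec_mismatches(spec, expected_id=EXPECTED_ID, expected_kwargs=EXPECTED_KWARGS):
    """Return {field: (expected, actual)} for every field that differs."""
    mismatches = {}
    if spec.id != expected_id:
        mismatches["id"] = (expected_id, spec.id)
    for key, expected in expected_kwargs.items():
        actual = spec.kwargs.get(key)
        if actual != expected:
            mismatches[key] = (expected, actual)
    return mismatches
```

Calling `spec_mismatches(dataset.recover_environment().spec)` should return an empty dictionary when the recovered environment matches the table; recovering the environment assumes the Atari dependencies are installed.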

Evaluation Environment Specs

This dataset doesn’t contain an `eval_env_spec` attribute, which means the environment used for evaluation has the same specs as the environment used to create the dataset. The following calls return environments with identical specs:

import minari

dataset = minari.load_dataset('atari/centipede/expert-v0')
env = dataset.recover_environment()
eval_env = dataset.recover_environment(eval_env=True)

assert env.spec == eval_env.spec