Expert

Description

Expert data generated by CleanRL PPO Impala. The training code uses standard Atari preprocessing (frame stacking, grayscale conversion, and resizing), but the dataset stores the raw, unpreprocessed observations. The training code, model, and metrics are available on CleanRL’s HuggingFace repository.
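Because the stored observations are raw RGB frames, anyone who wants inputs similar to what the agent saw during training has to redo the preprocessing. The following is a minimal sketch of the three steps named above, using NumPy only. The 84×84 target size and the stack of 4 frames are common Atari defaults and are assumptions here, not values confirmed by this page. The nearest-neighbour resize is a simple stand-in and may not match the interpolation used by the actual training code.

```python
import numpy as np


def to_grayscale(frame: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) uint8 RGB frame to an (H, W) uint8 luminance image."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = frame.astype(np.float32) @ weights
    return np.clip(gray.round(), 0, 255).astype(np.uint8)


def resize_nearest(frame: np.ndarray, size=(84, 84)) -> np.ndarray:
    """Nearest-neighbour resize of an (H, W) image (assumed stand-in method)."""
    h, w = frame.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return frame[rows[:, None], cols[None, :]]


def stack_frames(frames, num_stack: int = 4) -> np.ndarray:
    """Return (T, num_stack, H, W); early steps are padded by repeating frame 0."""
    frames = np.asarray(frames)
    padding = np.repeat(frames[:1], num_stack - 1, axis=0)
    padded = np.concatenate([padding, frames], axis=0)
    # Slice i holds the frame that is (num_stack - 1 - i) steps in the past.
    return np.stack([padded[i:i + len(frames)] for i in range(num_stack)], axis=1)


def preprocess_episode(observations, size=(84, 84), num_stack: int = 4) -> np.ndarray:
    """Grayscale, resize, and stack a sequence of raw (210, 160, 3) frames."""
    processed = [resize_nearest(to_grayscale(obs), size) for obs in observations]
    return stack_frames(processed, num_stack)
```

Applied to the `observations` array of one episode, `preprocess_episode` returns one stacked observation per input frame, with the most recent frame in the last channel.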

Dataset Specs

| Field | Value |
|---|---|
| Total Steps | 6864 |
| Total Episodes | 10 |
| Dataset Observation Space | Box(0, 255, (210, 160, 3), uint8) |
| Dataset Action Space | Discrete(18) |
| Algorithm | CleanBA PPO Impala |
| Author | Omar G. Younis |
| Email | omar@farama.org |
| Code Permalink | https://github.com/Farama-Foundation/minari-dataset-generation-scripts |
| Minari Version | 0.5.2 (supported) |
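To make the two space definitions concrete, here is a small sketch that checks whether a value fits `Box(0, 255, (210, 160, 3), uint8)` or `Discrete(18)`. The helper names are hypothetical and written with NumPy only. With Gymnasium installed, the `contains` method of the space objects is the more direct option.

```python
import numpy as np

OBS_SHAPE = (210, 160, 3)  # height, width, RGB channels, from the specs above
NUM_ACTIONS = 18           # size of Discrete(18)


def is_valid_observation(obs) -> bool:
    """True if obs is a uint8 array of the dataset's observation shape.

    The 0..255 bounds of the Box are implied by the uint8 dtype.
    """
    return (
        isinstance(obs, np.ndarray)
        and obs.dtype == np.uint8
        and obs.shape == OBS_SHAPE
    )


def is_valid_action(action) -> bool:
    """True if action is an integer in the range 0..17."""
    return isinstance(action, (int, np.integer)) and 0 <= int(action) < NUM_ACTIONS
```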

Download

minari download atari/centipede/expert-v0
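Once the dataset is downloaded, its episodes can be inspected from Python. The sketch below computes a few summary statistics. `summarize_episodes` is a hypothetical helper that only assumes each episode exposes a `rewards` sequence with one entry per step, which is how Minari's episode objects are laid out. The Minari import is deferred so the helper can be tried without the dataset installed.

```python
from statistics import mean


def summarize_episodes(episodes):
    """Return episode count, total steps, mean length, and mean return.

    Each episode only needs a `rewards` sequence with one entry per step.
    """
    lengths, returns = [], []
    for episode in episodes:
        lengths.append(len(episode.rewards))
        returns.append(float(sum(episode.rewards)))
    return {
        "episodes": len(lengths),
        "total_steps": sum(lengths),
        "mean_length": mean(lengths),
        "mean_return": mean(returns),
    }


def summarize_minari_dataset(dataset_id="atari/centipede/expert-v0"):
    """Load a local Minari dataset and summarize it (requires Minari)."""
    import minari  # lazy import: summarize_episodes works without it

    dataset = minari.load_dataset(dataset_id)
    return summarize_episodes(dataset.iterate_episodes())
```

For this dataset, `summarize_minari_dataset()` should report the 10 episodes and 6864 total steps listed in the dataset specs.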

Environment Specs

The rows of the following table are the Gymnasium environment specifications used to generate the dataset. For details on each parameter, see the Gymnasium documentation: https://gymnasium.farama.org/api/registry/#gymnasium.envs.registration.EnvSpec

This environment can be recovered from the Minari dataset as follows:

import minari

dataset = minari.load_dataset('atari/centipede/expert-v0')
env = dataset.recover_environment()

| Parameter | Value |
|---|---|
| ID | ALE/Centipede-v5 |
| Observation Space | Box(0, 255, (210, 160, 3), uint8) |
| Action Space | Discrete(18) |
| entry_point | ale_py.env:AtariEnv |
| max_episode_steps | 27000 |
| reward_threshold | None |
| nondeterministic | False |
| order_enforce | True |
| disable_env_checker | False |
| kwargs | {'game': 'centipede', 'obs_type': 'rgb', 'repeat_action_probability': 0, 'full_action_space': False, 'frameskip': 4, 'max_num_frames_per_episode': 108000} |
| additional_wrappers | () |
| vector_entry_point | None |
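The frame-related values in these specs appear consistent with one another, which a short calculation can show. With `frameskip` set to 4, each agent step advances the emulator by four frames, so a budget of 108000 frames corresponds to 27000 agent steps, the same number listed as `max_episode_steps`. The helper name below is hypothetical.

```python
def max_agent_steps(max_frames: int, frameskip: int) -> int:
    """Number of agent steps that fit in a per-episode emulator-frame budget."""
    return max_frames // frameskip


# Values copied from the kwargs entry of the environment specs.
kwargs = {"frameskip": 4, "max_num_frames_per_episode": 108000}
steps = max_agent_steps(kwargs["max_num_frames_per_episode"], kwargs["frameskip"])
print(steps)  # 27000, equal to max_episode_steps

# Average episode length from the dataset specs: 6864 steps over 10 episodes.
print(6864 / 10)  # 686.4
```

Since the whole dataset contains 6864 steps, fewer than the 27000-step limit, no episode in it can have reached that limit.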

Evaluation Environment Specs

This dataset doesn’t contain an eval_env_spec attribute, which means that the environment used for evaluation has the same specs as the environment used to create the dataset. The following calls return environments with identical specs:

import minari

dataset = minari.load_dataset('atari/centipede/expert-v0')
env = dataset.recover_environment()
eval_env = dataset.recover_environment(eval_env=True)

assert env.spec == eval_env.spec
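A recovered evaluation environment is typically used to roll out a policy and record episode returns. The sketch below shows one way to do that against the standard Gymnasium `reset`/`step` interface. `evaluate` and `evaluate_random_policy` are hypothetical helper names, and the random policy is only a placeholder for a trained agent. Recovering the environment also assumes that `ale_py` and the Atari ROMs are installed.

```python
def evaluate(env, policy, episodes: int = 1, seed: int = 0):
    """Run `policy` in `env` and return the undiscounted return of each episode."""
    returns = []
    for i in range(episodes):
        obs, info = env.reset(seed=seed + i)
        done, total = False, 0.0
        while not done:
            obs, reward, terminated, truncated, info = env.step(policy(obs))
            total += float(reward)
            # An episode ends on a terminal state or on a time-limit truncation.
            done = terminated or truncated
        returns.append(total)
    return returns


def evaluate_random_policy(dataset_id="atari/centipede/expert-v0", episodes=1):
    """Evaluate a uniformly random policy in the dataset's evaluation environment."""
    import minari  # lazy import: `evaluate` itself has no Minari dependency

    dataset = minari.load_dataset(dataset_id)
    eval_env = dataset.recover_environment(eval_env=True)
    try:
        return evaluate(eval_env, lambda obs: eval_env.action_space.sample(), episodes)
    finally:
        eval_env.close()
```

Replacing the lambda with a function that maps an observation to an action lets the same loop score a policy trained on this dataset.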