Large-Play
Description
The data is collected from the AntMaze_Large-v4 environment. At the beginning of each episode, random locations are selected for the goal and for the agent's reset position. More than 80% of the trajectories are successful; the failed trajectories occur because the Ant flips over and cannot stand up again. Note also that when the Ant reaches the goal, the episode does not terminate and no new target is generated, so reward keeps accumulating for the rest of the episode. The Ant reaches the goals by following a set of waypoints with a goal-reaching policy trained using SAC.
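Because reward keeps accumulating after the goal is reached, an episode's return is not a 0/1 success flag. The sketch below shows one way to estimate the success rate from per-episode rewards. It assumes a sparse reward, where any positive reward means the Ant was at the goal; this assumption is not stated above. The helper names are hypothetical, and `dataset_success_rate` assumes the dataset is already available locally.

```python
from typing import Iterable, Sequence


def episode_return(rewards: Iterable[float]) -> float:
    """Sum of rewards over one episode (can exceed 1 because the
    episode continues after the goal is reached)."""
    return float(sum(rewards))


def reached_goal(rewards: Iterable[float]) -> bool:
    """Assumption: rewards are sparse, so any positive reward means
    the Ant was at the goal at some step."""
    return any(r > 0 for r in rewards)


def success_rate(per_episode_rewards: Sequence[Iterable[float]]) -> float:
    """Fraction of episodes in which the goal was reached at least once."""
    episodes = list(per_episode_rewards)
    if not episodes:
        return 0.0
    return sum(reached_goal(r) for r in episodes) / len(episodes)


def dataset_success_rate(dataset_id: str = "D4RL/antmaze/large-play-v1") -> float:
    """Apply success_rate to every episode of a locally available Minari dataset."""
    import minari  # imported lazily so the helpers above work without Minari

    dataset = minari.load_dataset(dataset_id)
    return success_rate([ep.rewards for ep in dataset.iterate_episodes()])
```

Counting an episode as successful when any reward is positive, rather than thresholding the return, keeps the estimate independent of how long the Ant stays at the goal.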
Dataset Specs
Total Steps | 1000000
Total Episodes | 1000
Dataset Observation Space |
Dataset Action Space |
Algorithm | QIteration+SAC
Author | Alex Davey (alexdavey0@gmail.com)
Code Permalink | https://github.com/rodrigodelazcano/d4rl-minari-dataset-generation
Minari Version |
Download |
Environment Specs
The following table lists the Gymnasium environment specifications used to generate the dataset. For the meaning of each parameter, see the Gymnasium documentation: https://gymnasium.farama.org/api/registry/#gymnasium.envs.registration.EnvSpec
This environment can be recovered from the Minari dataset as follows:
import minari
dataset = minari.load_dataset('D4RL/antmaze/large-play-v1')
env = dataset.recover_environment()
ID | AntMaze_Large-v4
Observation Space |
Action Space |
entry_point |
max_episode_steps | 1000
reward_threshold | None
nondeterministic |
order_enforce |
disable_env_checker |
kwargs |
additional_wrappers |
vector_entry_point |
Evaluation Environment Specs
The evaluation environment can be recovered from the Minari dataset as follows:
import minari
dataset = minari.load_dataset('D4RL/antmaze/large-play-v1')
eval_env = dataset.recover_environment(eval_env=True)
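Once the evaluation environment is recovered, a policy can be scored on it with a plain Gymnasium loop. The following is a minimal sketch, not part of Minari: `evaluate` and `evaluate_random_policy` are hypothetical helpers, and success is again assumed to mean that some step returned a positive reward. Episodes end through `terminated` or `truncated`, so the loop relies on the environment's step limit.

```python
from typing import Any, Callable


def evaluate(env: Any, policy: Callable[[Any], Any],
             episodes: int = 10, seed: int = 0) -> float:
    """Run `policy` for `episodes` episodes and return the fraction in
    which at least one step gave a positive reward (assumed success signal)."""
    if episodes <= 0:
        return 0.0
    successes = 0
    for i in range(episodes):
        obs, info = env.reset(seed=seed + i)
        done = False
        reached = False
        while not done:
            obs, reward, terminated, truncated, info = env.step(policy(obs))
            reached = reached or reward > 0
            done = terminated or truncated
        successes += int(reached)
    return successes / episodes


def evaluate_random_policy(dataset_id: str = "D4RL/antmaze/large-play-v1") -> float:
    """Baseline: score uniformly random actions on the evaluation environment."""
    import minari  # imported lazily; requires the dataset and its environment

    eval_env = minari.load_dataset(dataset_id).recover_environment(eval_env=True)
    try:
        return evaluate(eval_env, lambda obs: eval_env.action_space.sample())
    finally:
        eval_env.close()
```

A trained policy can be passed in place of the random one as any callable that maps an observation to an action.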
ID | AntMaze_Large-v4
Observation Space |
Action Space |
entry_point |
max_episode_steps | 1000
reward_threshold | None
nondeterministic |
order_enforce |
disable_env_checker |
kwargs |
additional_wrappers |
vector_entry_point |