Umaze
Description
The data is collected from the AntMaze_UMaze-v4 environment, which contains a U-shaped maze. Every episode has the same fixed goal and reset locations. More than 90% of the trajectories are successful; the failures occur when the Ant flips over and cannot stand up again. Note also that when the Ant reaches the goal, the episode does not terminate and no new target is generated, so the reward keeps accumulating. The Ant reaches the goal by following a set of waypoints with a goal-reaching policy trained using SAC.
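The success rate and the reward accumulation described above can be checked with a few lines of Python. This is a minimal sketch, assuming the sparse AntMaze reward (positive only on steps spent at the goal, zero otherwise), so that any positive reward marks an episode as successful. The helper names `summarize_episodes` and `rewards_from_minari` are hypothetical and not part of Minari.

```python
def summarize_episodes(episode_rewards):
    """Summarize an iterable of per-episode reward sequences.

    Assumption: rewards are sparse (positive only at the goal), so an
    episode with any positive reward is counted as a success.
    """
    # Sum each episode's rewards to get its (undiscounted) return.
    returns = [float(sum(rewards)) for rewards in episode_rewards]
    if not returns:
        raise ValueError("need at least one episode")
    successes = sum(1 for ret in returns if ret > 0)
    return {
        "episodes": len(returns),
        "success_rate": successes / len(returns),
        # Returns above 1 reflect reward accumulating after the goal is reached.
        "mean_return": sum(returns) / len(returns),
    }


def rewards_from_minari(dataset_id="D4RL/antmaze/umaze-v1"):
    """Collect per-episode rewards; needs Minari and the downloaded dataset."""
    import minari  # imported lazily so the summary helper works without it

    dataset = minari.load_dataset(dataset_id)
    return [episode.rewards for episode in dataset.iterate_episodes()]
```

With the dataset available locally, `summarize_episodes(rewards_from_minari())` would report the fraction of successful trajectories and the mean return per episode.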
Dataset Specs
Total Steps | 1000000
Total Episodes | 1430
Dataset Observation Space |
Dataset Action Space |
Algorithm | QIteration+SAC
Author | Alex Davey
Email | alexdavey0@gmail.com
Code Permalink | https://github.com/rodrigodelazcano/d4rl-minari-dataset-generation
Minari Version |
Download |
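As a quick sanity check on these figures, dividing the total steps by the total episodes gives the mean episode length, which can be compared with the 700-step limit listed under the environment specs. The sketch below is plain arithmetic on the numbers in the table; the function name `mean_episode_length` is hypothetical.

```python
def mean_episode_length(total_steps, total_episodes):
    """Average number of steps per episode."""
    if total_episodes <= 0:
        raise ValueError("total_episodes must be positive")
    return total_steps / total_episodes


# Values copied from the dataset specs table.
TOTAL_STEPS = 1_000_000
TOTAL_EPISODES = 1430
MAX_EPISODE_STEPS = 700  # from the environment specs

mean_len = mean_episode_length(TOTAL_STEPS, TOTAL_EPISODES)
print(f"mean episode length: {mean_len:.1f} steps (limit {MAX_EPISODE_STEPS})")
# prints "mean episode length: 699.3 steps (limit 700)"
```

A mean this close to the limit suggests that nearly every episode runs for the full 700 steps, which fits the description of episodes not terminating at the goal. With the dataset loaded, the same totals should be available as `dataset.total_steps` and `dataset.total_episodes`.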
Environment Specs
The rows of the following table correspond to the Gymnasium environment specifications used to generate the dataset. For what each parameter means, see the Gymnasium documentation: https://gymnasium.farama.org/api/registry/#gymnasium.envs.registration.EnvSpec
This environment can be recovered from the Minari dataset as follows:
import minari
dataset = minari.load_dataset('D4RL/antmaze/umaze-v1')
env = dataset.recover_environment()
ID | AntMaze_UMaze-v4
Observation Space |
Action Space |
entry_point |
max_episode_steps | 700
reward_threshold | None
nondeterministic |
order_enforce |
disable_env_checker |
kwargs |
additional_wrappers |
vector_entry_point |
Evaluation Environment Specs
The evaluation environment can be recovered from the Minari dataset as follows:
import minari
dataset = minari.load_dataset('D4RL/antmaze/umaze-v1')
eval_env = dataset.recover_environment(eval_env=True)
ID | AntMaze_UMaze-v4
Observation Space |
Action Space |
entry_point |
max_episode_steps | 700
reward_threshold | None
nondeterministic |
order_enforce |
disable_env_checker |
kwargs |
additional_wrappers |
vector_entry_point |
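One way the recovered evaluation environment might be used is to roll out a policy for a few episodes and report how often it reaches the goal. The sketch below assumes the standard Gymnasium `reset`/`step` interface and that the environment reports goal-reaching through a boolean `success` entry in `info`. That key is an assumption about the environment, so adjust it if your version differs. `evaluate_policy` and `policy` are hypothetical names.

```python
def evaluate_policy(env, policy, num_episodes=10, seed=0):
    """Return the fraction of episodes in which the goal was reached.

    `policy` is any callable mapping an observation to an action.
    Assumption: `info["success"]` is True on steps where the agent is
    at the goal; a missing key is treated as no success.
    """
    if num_episodes <= 0:
        raise ValueError("num_episodes must be positive")
    successes = 0
    for episode in range(num_episodes):
        obs, info = env.reset(seed=seed + episode)
        reached_goal = False
        terminated = truncated = False
        # Reaching the goal does not terminate the episode, so the loop
        # normally ends via truncation at max_episode_steps.
        while not (terminated or truncated):
            action = policy(obs)
            obs, reward, terminated, truncated, info = env.step(action)
            reached_goal = reached_goal or bool(info.get("success", False))
        successes += int(reached_goal)
    return successes / num_episodes
```

For example, `evaluate_policy(eval_env, lambda obs: eval_env.action_space.sample())` would give a random-policy baseline against which a trained agent can be compared.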