Umaze-Dense
Description
The data is collected from the PointMaze_UMazeDense-v3 environment, which contains a U-shaped maze. The agent uses a PD controller to follow a path of waypoints generated with QIteration until it reaches the goal. The task is continuing: when the agent reaches the goal, the environment generates a new random goal without resetting the agent's location. The reward function is dense: the reward is the exponential of the negative Euclidean distance between the goal and the agent. To add variance to the collected paths, random noise is added to the actions taken by the agent.
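The reward and the noisy controller described above can be sketched in a few lines. This is a minimal illustration, not the dataset's generation code: the function names, the gains `kp` and `kd`, the noise scale `noise_std`, and the clipping of actions to [-1, 1] are all assumptions made for the example.

```python
import math
import random


def dense_reward(agent_xy, goal_xy):
    """Dense reward as described in the text: exp(-Euclidean distance to the goal).

    Equals 1.0 when the agent is exactly on the goal and decays toward 0
    as the agent moves away.
    """
    return math.exp(-math.dist(agent_xy, goal_xy))


def noisy_pd_action(position, velocity, waypoint,
                    kp=10.0, kd=1.0, noise_std=0.2, rng=random):
    """PD control toward a waypoint with Gaussian noise added to the action.

    kp, kd, noise_std and the [-1, 1] clipping range are hypothetical values
    chosen for illustration only.
    """
    action = []
    for pos, vel, target in zip(position, velocity, waypoint):
        # Proportional term pulls toward the waypoint, derivative term damps velocity.
        force = kp * (target - pos) - kd * vel
        # Noise adds variance to the collected paths.
        force += rng.gauss(0.0, noise_std)
        # Keep the action inside the assumed bounds.
        action.append(max(-1.0, min(1.0, force)))
    return action
```

Passing a seeded `random.Random` instance as `rng` makes the noise reproducible, and `noise_std=0.0` recovers the noise-free controller.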
Dataset Specs
Total Steps | 1000000 |
Total Episodes | 13210 |
Dataset Observation Space | |
Dataset Action Space | |
Algorithm | QIteration |
Author | Rodrigo Perez-Vicente (rperezvicente@farama.org) |
Code Permalink | https://github.com/rodrigodelazcano/d4rl-minari-dataset-generation |
Minari Version | |
Download | |
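The step and episode totals in the table can be cross-checked against a loaded dataset. The helper below is a hypothetical sketch (`summarize_episodes` is not part of Minari): it only assumes that each episode object exposes a `rewards` sequence with one entry per step, and it is demonstrated here on stand-in objects so that it runs without downloading anything.

```python
from types import SimpleNamespace


def summarize_episodes(episodes):
    """Return (num_episodes, total_steps, mean_return) for an iterable of episodes.

    Each episode is assumed to expose `rewards`, a sequence with one reward per step.
    """
    num_episodes = 0
    total_steps = 0
    total_return = 0.0
    for episode in episodes:
        num_episodes += 1
        total_steps += len(episode.rewards)
        total_return += float(sum(episode.rewards))
    mean_return = total_return / num_episodes if num_episodes else 0.0
    return num_episodes, total_steps, mean_return


# Stand-in episodes for illustration; these are not real dataset values.
fake_episodes = [
    SimpleNamespace(rewards=[0.5, 1.0]),
    SimpleNamespace(rewards=[0.25]),
]
print(summarize_episodes(fake_episodes))  # (2, 3, 0.875)
```

With the dataset downloaded, calling `summarize_episodes(dataset.iterate_episodes())` should report episode and step counts matching the table, assuming the installed dataset version matches this page.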
Environment Specs
The table below lists the Gymnasium environment specification used to generate the dataset. For the meaning of each parameter, see the Gymnasium documentation: https://gymnasium.farama.org/api/registry/#gymnasium.envs.registration.EnvSpec
This environment can be recovered from the Minari dataset as follows:
import minari
dataset = minari.load_dataset('D4RL/pointmaze/umaze-dense-v2')
env = dataset.recover_environment()
ID | PointMaze_UMazeDense-v3 |
Observation Space | |
Action Space | |
entry_point | |
max_episode_steps | 1000000 |
reward_threshold | None |
nondeterministic | |
order_enforce | |
disable_env_checker | |
kwargs | |
additional_wrappers | |
vector_entry_point | |
Evaluation Environment Specs
This environment can be recovered from the Minari dataset as follows:
import minari
dataset = minari.load_dataset('D4RL/pointmaze/umaze-dense-v2')
eval_env = dataset.recover_environment(eval_env=True)
ID | PointMaze_UMazeDense-v3 |
Observation Space | |
Action Space | |
entry_point | |
max_episode_steps | 1000000 |
reward_threshold | None |
nondeterministic | |
order_enforce | |
disable_env_checker | |
kwargs | |
additional_wrappers | |
vector_entry_point | |
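Because the task is continuing and `max_episode_steps` is 1000000, an evaluation rollout is likely to need its own step budget. The function below is a minimal sketch under that assumption: `rollout` and its default `max_steps=300` are hypothetical, and the environment is only assumed to follow the standard Gymnasium `reset`/`step` interface.

```python
def rollout(env, policy, max_steps=300, seed=None):
    """Run `policy` in `env` for at most `max_steps` steps.

    Returns (total_reward, steps_taken). `policy` is any callable that maps an
    observation to an action. The step cap is an illustrative choice, not a
    value taken from the dataset.
    """
    obs, info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += float(reward)
        steps += 1
        # Stop early if the environment itself ends the episode.
        if terminated or truncated:
            break
    return total_reward, steps
```

For example, `rollout(eval_env, lambda obs: eval_env.action_space.sample())` would give a random-policy baseline to compare a trained policy against.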