Medium
Description
The data was collected from the PointMaze_Medium-v3 environment. The agent uses a PD controller to follow a path of waypoints generated with QIteration until it reaches the goal. The task is continuing: when the agent reaches the goal, the environment generates a new random goal without resetting the agent's location. The reward function is sparse, returning 1 when the goal is reached and 0 otherwise. To add variance to the collected paths, random noise is added to the agent's actions.
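As a rough illustration of the controller and reward described above, the following sketch computes one noisy PD action toward a waypoint and the corresponding sparse reward. The function names, the gains `kp` and `kd`, the Gaussian noise model, the `[-1, 1]` action bounds, and the goal radius are all assumptions made for this example; they are not taken from the dataset generation code.

```python
import math
import random


def pd_action(pos, vel, waypoint, kp=10.0, kd=1.0, noise_std=0.0, rng=random):
    """Return a 2D action that pushes the agent toward `waypoint`.

    kp, kd and noise_std are illustrative values (assumptions).
    """
    action = []
    for p, v, w in zip(pos, vel, waypoint):
        u = kp * (w - p) - kd * v       # proportional term plus velocity damping
        u += rng.gauss(0.0, noise_std)  # assumed Gaussian noise to vary the paths
        action.append(max(-1.0, min(1.0, u)))  # clip to assumed [-1, 1] bounds
    return action


def sparse_reward(pos, goal, radius=0.45):
    """Return 1.0 if `pos` is within `radius` of `goal`, else 0.0.

    The radius is an assumed threshold for what counts as reaching the goal.
    """
    return 1.0 if math.dist(pos, goal) <= radius else 0.0
```

In a continuing task, a data collection loop built on these helpers would pick a new goal whenever `sparse_reward` returns 1.0 and keep stepping from the agent's current position.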
Dataset Specs
Total Steps | 1000000
Total Episodes | 4752
Dataset Observation Space |
Dataset Action Space |
Algorithm | QIteration
Author | Rodrigo Perez-Vicente (rperezvicente@farama.org)
Code Permalink | https://github.com/rodrigodelazcano/d4rl-minari-dataset-generation
Minari Version |
Download |
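The totals in the table above can be cross-checked against the downloaded data. This is a minimal sketch: `summarize_episodes` and `summarize_medium_dataset` are hypothetical helpers, and the first one only assumes that each episode exposes a `rewards` sequence, as the episodes yielded by Minari's `iterate_episodes()` do.

```python
def summarize_episodes(episodes):
    """Count episodes, steps, and goal hits (reward > 0) in an episode iterable."""
    n_episodes = n_steps = n_goals = 0
    for episode in episodes:
        n_episodes += 1
        n_steps += len(episode.rewards)
        n_goals += sum(1 for r in episode.rewards if r > 0)
    return {"episodes": n_episodes, "steps": n_steps, "goals_reached": n_goals}


def summarize_medium_dataset():
    """Load the dataset and summarize it (needs minari and the downloaded data)."""
    import minari  # imported lazily so the helper above works without it

    dataset = minari.load_dataset('D4RL/pointmaze/medium-v2')
    return summarize_episodes(dataset.iterate_episodes())
```

If the local copy matches this page, the `episodes` and `steps` entries should agree with the Total Episodes and Total Steps rows.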
Environment Specs
The table below lists the Gymnasium environment specifications used to generate the dataset. To learn what each parameter means, see the Gymnasium documentation: https://gymnasium.farama.org/api/registry/#gymnasium.envs.registration.EnvSpec
This environment can be recovered from the Minari dataset as follows:
import minari
dataset = minari.load_dataset('D4RL/pointmaze/medium-v2')
env = dataset.recover_environment()
ID | PointMaze_Medium-v3
Observation Space |
Action Space |
entry_point |
max_episode_steps | 1000000
reward_threshold | None
nondeterministic |
order_enforce |
disable_env_checker |
kwargs |
additional_wrappers |
vector_entry_point |
Evaluation Environment Specs
This environment can be recovered from the Minari dataset as follows:
import minari
dataset = minari.load_dataset('D4RL/pointmaze/medium-v2')
eval_env = dataset.recover_environment(eval_env=True)
ID | PointMaze_Medium-v3
Observation Space |
Action Space |
entry_point |
max_episode_steps | 1000000
reward_threshold | None
nondeterministic |
order_enforce |
disable_env_checker |
kwargs |
additional_wrappers |
vector_entry_point |
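One way to use the recovered evaluation environment is to roll out a policy for a fixed number of steps and sum the sparse reward, which counts the goals reached because each goal yields a reward of 1 and the task then continues. The sketch below assumes the standard Gymnasium `reset`/`step` API; `count_goals` and `policy` are hypothetical names introduced for this example.

```python
def count_goals(env, policy, num_steps=1000, seed=None):
    """Roll out `policy` for `num_steps` steps and return the summed sparse reward."""
    obs, _info = env.reset(seed=seed)
    total = 0.0
    for _ in range(num_steps):
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total += float(reward)
        if terminated or truncated:
            # Start a fresh episode if the environment ends the current one
            obs, _info = env.reset()
    return total
```

For a random baseline, something like `count_goals(eval_env, lambda obs: eval_env.action_space.sample())` could be used with the `eval_env` recovered above.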