# Large

## Description

The data is collected from the PointMaze_Large-v3 environment. The agent uses a PD controller to follow a path of waypoints, generated with QIteration, until it reaches the goal. The task is continuing, which means that when the agent reaches the goal the environment generates a new random goal without resetting the agent's location. The reward function is sparse: it returns 1 when the goal is reached and 0 otherwise. To add variance to the collected paths, random noise is added to the actions taken by the agent.
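The collection procedure above can be sketched as follows. This is a minimal, self-contained illustration rather than the actual generation script: the point-mass dynamics, the gains `kp` and `kd`, the noise scale, and the single waypoint are all illustrative assumptions.

```python
import random

def pd_action(pos, vel, waypoint, kp=10.0, kd=2.0, noise=0.2, rng=random):
    """PD control toward a waypoint plus exploration noise (gains and noise scale are illustrative)."""
    action = []
    for p, v, w in zip(pos, vel, waypoint):
        a = kp * (w - p) - kd * v + rng.gauss(0.0, noise)
        action.append(max(-1.0, min(1.0, a)))  # clip to the Box(-1, 1) action space
    return action

def rollout(waypoint, steps=500, dt=0.05, seed=0):
    """Integrate a toy 2D point mass following the waypoint; returns the final position."""
    rng = random.Random(seed)
    pos, vel = [0.0, 0.0], [0.0, 0.0]
    for _ in range(steps):
        a = pd_action(pos, vel, waypoint, rng=rng)
        vel = [v + ai * dt for v, ai in zip(vel, a)]  # semi-implicit Euler: velocity first
        pos = [p + vi * dt for p, vi in zip(pos, vel)]
    return pos

final = rollout([1.0, -1.0])  # the injected noise keeps repeated rollouts from being identical
```

Because of the noise term, repeated rollouts through the same waypoints produce slightly different trajectories, which is what gives the dataset its variance.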

## Dataset Specs

| Spec | Value |
| --- | --- |
| Total Timesteps | 1000000 |
| Total Episodes | 3071 |
| Flatten Observations | True |
| Flatten Actions | False |
| Algorithm | QIteration |
| Author | Rodrigo Perez-Vicente |
| Email | rperezvicente@farama.org |
| Code Permalink | https://github.com/rodrigodelazcano/d4rl-minari-dataset-generation |
| Download | `minari.download_dataset("pointmaze-large-v0")` |

## Environment Specs

| Spec | Value |
| --- | --- |
| ID | PointMaze_Large-v3 |
| Action Space | `Box(-1.0, 1.0, (2,), float32)` |
| Observation Space | `Dict('achieved_goal': Box(-inf, inf, (2,), float64), 'desired_goal': Box(-inf, inf, (2,), float64), 'observation': Box(-inf, inf, (4,), float64))` |
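Since the dataset is stored with Flatten Observations set to True, the Dict observation above is concatenated into a single 8-dimensional vector. The sketch below shows the idea, assuming (as with Gymnasium's Dict space, which keeps its keys sorted) that entries are concatenated in key order: `achieved_goal`, `desired_goal`, `observation`. The sample values are illustrative.

```python
def flatten_obs(obs: dict) -> list:
    """Concatenate a Dict observation into one flat vector, iterating keys in sorted order."""
    flat = []
    for key in sorted(obs):  # achieved_goal, desired_goal, observation
        flat.extend(obs[key])
    return flat

sample = {
    "achieved_goal": [1.2, 3.4],           # (2,) current xy position (illustrative values)
    "desired_goal": [5.0, 6.0],            # (2,) goal xy position
    "observation": [1.2, 3.4, 0.1, -0.2],  # (4,) xy position and xy velocity
}
flat = flatten_obs(sample)  # 2 + 2 + 4 = 8 values
```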