Large-Dense

Description

The data is collected from the PointMaze_LargeDense-v3 environment. The agent uses a PD controller to follow a path of waypoints, generated with QIteration, until it reaches the goal. The task is continuing: when the agent reaches the goal, the environment generates a new random goal without resetting the agent's location. The reward function is dense, defined as the negative Euclidean distance between the goal and the agent. To add variance to the collected paths, random noise is added to the actions taken by the agent.
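The reward and control scheme described above can be sketched in a few lines. This is a minimal illustration, not the dataset's generation code: the function names, the gains `kp` and `kd`, the Gaussian form of the noise, and the clipping step are all assumptions made for the example (the clipping simply mirrors the environment's action bounds of -1.0 to 1.0).

```python
import math
import random


def dense_reward(achieved_goal, desired_goal):
    """Negative Euclidean distance between the agent and the goal."""
    return -math.dist(achieved_goal, desired_goal)


def pd_action(position, velocity, waypoint, kp=10.0, kd=1.0,
              noise_std=0.0, rng=None):
    """One PD control step toward a waypoint (hypothetical gains).

    Gaussian noise is an assumption; the source only says that
    random noise is added to the actions.
    """
    rng = rng if rng is not None else random
    action = []
    for p, v, w in zip(position, velocity, waypoint):
        # Proportional term on position error, derivative term on velocity.
        u = kp * (w - p) - kd * v
        # Perturb the command to diversify the collected paths.
        u += rng.gauss(0.0, noise_std)
        # Keep the command inside the action bounds [-1.0, 1.0].
        action.append(max(-1.0, min(1.0, u)))
    return action


reward = dense_reward((0.0, 0.0), (3.0, 4.0))
action = pd_action((0.0, 0.0), (0.0, 0.0), (1.0, 0.0))
```

Because the reward depends only on the distance to the current goal, it is always non-positive and approaches zero as the agent closes in on the goal.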

Dataset Specs

Total Timesteps: 1000000
Total Episodes: 3083
Flatten Observations: True
Flatten Actions: False
Algorithm: QIteration
Author: Rodrigo Perez-Vicente
Email: rperezvicente@farama.org
Code Permalink: https://github.com/rodrigodelazcano/d4rl-minari-dataset-generation
Download: minari.download_dataset("pointmaze-large-dense-v0")

Environment Specs

ID: PointMaze_LargeDense-v3
Action Space: Box(-1.0, 1.0, (2,), float32)
Observation Space: Dict('achieved_goal': Box(-inf, inf, (2,), float64), 'desired_goal': Box(-inf, inf, (2,), float64), 'observation': Box(-inf, inf, (4,), float64))
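Since the dataset is recorded with flattened observations, it may help to see how the three entries of the Dict space could map onto a single vector. The sketch below assumes that flattening concatenates the entries in the key order listed above (achieved_goal, desired_goal, observation), giving 2 + 2 + 4 = 8 values. That ordering and the helper names are assumptions for illustration and should be checked against the actual dataset before relying on them.

```python
# Assumed key order and sizes, taken from the observation space listed above.
KEYS = (("achieved_goal", 2), ("desired_goal", 2), ("observation", 4))


def flatten_observation(obs):
    """Concatenate the Dict entries into one flat list (assumed order)."""
    flat = []
    for key, size in KEYS:
        values = [float(x) for x in obs[key]]
        if len(values) != size:
            raise ValueError(f"{key} must have {size} values")
        flat.extend(values)
    return flat


def unflatten_observation(flat):
    """Split a flat 8-value list back into the three named entries."""
    obs, start = {}, 0
    for key, size in KEYS:
        obs[key] = list(flat[start:start + size])
        start += size
    return obs


example = {
    "achieved_goal": [0.5, -1.0],
    "desired_goal": [2.0, 3.0],
    "observation": [0.5, -1.0, 0.1, 0.0],
}
flat = flatten_observation(example)
```

Under this assumption, the goal and agent positions needed to recompute the dense reward sit in the first four entries of each flattened observation.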