# Cloned

## Description

Data obtained by training an imitation policy on the expert and human demonstrations, then running that policy and mixing its rollouts with the demonstrations at a 50-50 ratio. This dataset is provided by D4RL. The environment used to collect the dataset is `AdroitHandDoor-v1`.
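The 50-50 mixing step can be sketched in plain Python. This is an illustrative toy, not the D4RL pipeline: the `mix_fifty_fifty` helper and the string "steps" are placeholders for real transition data.

```python
import random

def mix_fifty_fifty(demo_steps, policy_steps, total, seed=0):
    """Illustrative 50-50 mix of demonstration and policy-rollout steps.

    `demo_steps` and `policy_steps` are placeholder lists standing in for
    transitions; the real D4RL pipeline mixes full trajectories.
    """
    rng = random.Random(seed)
    half = total // 2
    # Draw half the budget from each source, then shuffle the result.
    mixed = rng.sample(demo_steps, half) + rng.sample(policy_steps, total - half)
    rng.shuffle(mixed)
    return mixed

# Toy usage: 4 steps, half from "demos" (d*), half from "policy" (p*).
mixed = mix_fifty_fifty(["d1", "d2", "d3"], ["p1", "p2", "p3"], total=4)
```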
## Dataset Specs

| Spec | Value |
| --- | --- |
| Total Steps | 1000000 |
| Total Episodes | 4358 |
| Dataset Observation Space | |
| Dataset Action Space | |
| Algorithm | Not provided |
| Author | Rodrigo de Lazcano |
| Author Email | rperezvicente@farama.org |
| Code Permalink | https://github.com/rodrigodelazcano/d4rl-minari-dataset-generation |
| Minari Version | |
| Download | |
## Environment Specs

The following table rows correspond to the Gymnasium environment specifications used to generate the dataset. To read more about what each parameter means, see the [Gymnasium documentation](https://gymnasium.farama.org/api/registry/#gymnasium.envs.registration.EnvSpec).

This environment can be recovered from the Minari dataset as follows:

```python
import minari

dataset = minari.load_dataset('D4RL/door/cloned-v2')
env = dataset.recover_environment()
```
| Spec | Value |
| --- | --- |
| ID | AdroitHandDoor-v1 |
| Observation Space | |
| Action Space | |
| entry_point | |
| max_episode_steps | 200 |
| reward_threshold | None |
| nondeterministic | |
| order_enforce | |
| disable_env_checker | |
| kwargs | |
| additional_wrappers | |
| vector_entry_point | |
## Evaluation Environment Specs

This dataset doesn't contain an `eval_env_spec` attribute, which means the specs of the environment used for evaluation are the same as the specs of the environment used to create the dataset. The following calls will return the same environment:

```python
import minari

dataset = minari.load_dataset('D4RL/door/cloned-v2')
env = dataset.recover_environment()
eval_env = dataset.recover_environment(eval_env=True)
assert env.spec == eval_env.spec
```