AIR-DI/DWBC (forked from ryanxhr/DWBC)

Author's implementation of DWBC in "Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations"

Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations

This is the code for reproducing the results of the paper "Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations", accepted at ICML 2022. The paper can be found here.

Usage

Paper results were collected with MuJoCo 1.50 (and mujoco-py 1.50.1.1) in OpenAI gym 0.17.0 with the D4RL datasets. Networks are trained using PyTorch 1.4.0 and Python 3.6.
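Because the results depend on fairly old pinned versions, it can help to compare an environment against them before launching a run. The following is a minimal sketch, not part of this repository: `EXPECTED` and `find_mismatches` are hypothetical names, and the pip package names (`torch`, `gym`, `mujoco-py`) are assumed to match how the dependencies were installed. The MuJoCo 1.50 binaries are installed outside pip and are not covered.

```python
# Versions stated in this README; the dictionary keys are assumed pip package names.
EXPECTED = {
    "python": "3.6",
    "torch": "1.4.0",
    "gym": "0.17.0",
    "mujoco-py": "1.50.1.1",
}


def find_mismatches(installed, expected=EXPECTED):
    """Return (name, wanted, found) tuples for packages that differ from the pins.

    `installed` maps a package name to its version string. A version matches
    if it equals the pin or extends it with further components (for example
    "3.6.9" matches "3.6"). A local build suffix such as "+cu100" is ignored.
    """
    mismatches = []
    for name, want in expected.items():
        have = installed.get(name)
        if have is None:
            # Package is missing from the environment entirely.
            mismatches.append((name, want, None))
            continue
        base = have.split("+")[0]  # drop a local version label, if any
        if not (base == want or base.startswith(want + ".")):
            mismatches.append((name, want, have))
    return mismatches
```

The `installed` mapping can be filled in however is convenient, for example by hand from the output of `pip freeze`. An empty result means the environment agrees with the versions listed above.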

The paper results can be reproduced by running:

./run_dwbc.sh

You can also run DWBC on the setting used in DemoDICE and SMODICE by running main_setting_demodice.py:

# --num_e:   number of expert trajectories in D_e
# --num_o_e: number of expert trajectories in D_o
# --num_o_o: number of non-expert trajectories in D_o
python main_setting_demodice.py \
   --algorithm="DWBC" \
   --env_e="hopper-expert-v2" \
   --env_o="hopper-random-v2" \
   --num_e=1 \
   --num_o_e=200 \
   --num_o_o=2000
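To make the three trajectory counts concrete, here is a minimal sketch of how an expert set D_e and a mixed set D_o could be assembled from two pools of trajectories. It is illustrative only and is not taken from `main_setting_demodice.py`: the function `build_datasets`, the `seed` parameter, and the choice to keep D_e disjoint from the expert trajectories in D_o are all assumptions.

```python
import random


def build_datasets(expert_trajs, non_expert_trajs, num_e, num_o_e, num_o_o, seed=0):
    """Split trajectory pools into an expert set D_e and a mixed set D_o.

    D_e receives `num_e` expert trajectories. D_o receives `num_o_e` further
    expert trajectories plus `num_o_o` non-expert trajectories. Keeping the
    two expert subsets disjoint is an assumption of this sketch.
    """
    if num_e + num_o_e > len(expert_trajs):
        raise ValueError("not enough expert trajectories")
    if num_o_o > len(non_expert_trajs):
        raise ValueError("not enough non-expert trajectories")

    rng = random.Random(seed)
    # Sample without replacement so no trajectory is used twice.
    expert_idx = rng.sample(range(len(expert_trajs)), num_e + num_o_e)
    other_idx = rng.sample(range(len(non_expert_trajs)), num_o_o)

    d_e = [expert_trajs[i] for i in expert_idx[:num_e]]
    d_o = [expert_trajs[i] for i in expert_idx[num_e:]]
    d_o += [non_expert_trajs[i] for i in other_idx]
    return d_e, d_o
```

With the values from the command above (`num_e=1`, `num_o_e=200`, `num_o_o=2000`), this yields one trajectory in D_e and 2200 in D_o, of which 200 come from the expert pool.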

Bibtex

@inproceedings{xu2022discriminator,
  title     = {Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations},
  author    = {Xu, Haoran and Zhan, Xianyuan and Yin, Honglei and Qin, Huiling},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {24725--24742},
  year      = {2022},
}
