Stable-Baselines3 - Contrib (SB3-Contrib)

Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code. "sb3-contrib" for short.

What is SB3-Contrib?

A place for RL algorithms and tools that are considered experimental, e.g. implementations of the latest publications. The goal is to keep the simplicity, documentation and style of stable-baselines3 for less mature implementations.

Why create this repository?

Over the span of stable-baselines and stable-baselines3, the community has been eager to contribute in the form of better logging utilities, environment wrappers, extended support (e.g. different action spaces) and learning algorithms.

However, these utilities were sometimes too niche to be considered for stable-baselines, or proved too difficult to integrate into the existing code without creating a mess. sb3-contrib aims to fix this by not requiring the neatest integration with existing code and by not setting limits on what is too niche: almost everything remotely useful goes! We hope this allows us to provide reliable implementations that follow the usual stable-baselines standards (consistent style, documentation, etc.) beyond the relatively small scope of utilities in the main repository.

Features

See documentation for the full list of included features.

RL Algorithms:

Gym Wrappers:

Documentation

Documentation is available online: https://sb3-contrib.readthedocs.io/

Installation

To install Stable Baselines3 contrib with pip, execute:

pip install sb3-contrib

We recommend using the master version of Stable Baselines3.

To install Stable Baselines3 master version:

pip install git+https://github.com/DLR-RM/stable-baselines3

To install Stable Baselines3 contrib master version:

pip install git+https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
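
After installing, it can be useful to confirm which versions ended up in the environment, since the contrib package is meant to track a recent Stable Baselines3. The following is a minimal sketch using only the Python standard library (Python 3.8+); the helper names `installed_version` and `report` are illustrative and not part of either package, and the distribution names are assumed to be `stable-baselines3` and `sb3-contrib` as published for pip.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def report(
    dist_names: Iterable[str] = ("stable-baselines3", "sb3-contrib"),
) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    # Print one line per package, e.g. after running the pip commands above.
    for name, ver in report().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

If either package is reported as not installed, rerun the corresponding pip command in the same environment.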

How To Contribute

If you want to contribute, please read the CONTRIBUTING.md guide first.

Citing the Project

To cite this repository in publications (please cite SB3 directly):

@article{stable-baselines3,
  author  = {Antonin Raffin and Ashley Hill and Adam Gleave and Anssi Kanervisto and Maximilian Ernestus and Noah Dormann},
  title   = {Stable-Baselines3: Reliable Reinforcement Learning Implementations},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {268},
  pages   = {1-8},
  url     = {http://jmlr.org/papers/v22/20-1364.html}
}