Commit Graph

31 Commits

Author SHA1 Message Date
Alex Pasquali ab8684f469
[Feature] Non-shared features extractor in on-policy algorithms (#130)
* Modified sb3_contrib/common/maskable/policies.py

- Added support for non-shared features extractor in file sb3_contrib/common/maskable/policies.py
- updated changelog

* Modified sb3_contrib/common/recurrent/policies.py

* Modified sb3_contrib/qrdqn/policies.py and sb3_contrib/tqc/policies.py

* Updated test_cnn.py

* Upgrade SB3 version

* Revert changes in formatting

* Remove duplicate normalize_images

* Add test for image-like inputs

* Fixes and add more tests

* Update SB3 version

* Fix ARS warnings

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-12-23 00:23:45 +01:00
Zikang Xiong ddb3a1355e
Expose modules in `__init__.py` with `__all__` attribute (#124)
* expose modules in __init__.py with __all__ attribute

* Update version

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-12-05 15:53:57 +01:00
Quentin Gallouédec 36aeae18b5
Fix `Self` return type (#116)
* Self hint for distributions

* ClassSelf to SelfClass
2022-11-22 13:12:35 +01:00
Antonin RAFFIN c75ad7dd58
Remove deprecated features (#108)
* Remove deprecated features

* Upgrade SB3

* Fix tests
2022-10-11 13:04:18 +02:00
Antonin RAFFIN 52795a307e
Add progress bar argument (#107)
* Add progress bar argument

* Sort imports
2022-10-10 18:44:13 +02:00
Quentin Gallouédec e9c97948c8
Fixed the return type of ``.load()`` methods (#106)
* Fix return type for learn using TypeVar

* Update changelog

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2022-10-10 17:21:38 +02:00
Quentin Gallouédec dec7b5303a
Deprecate ``create_eval_env``, ``eval_env`` and ``eval_freq`` parameter (#105)
* Deprecate ``eval_env``, ``eval_freq```and ``create_eval_env``

* Update changelog

* Typo

* Raise deprecation warining in _setup_learn

* Upgrade to latest SB3 version and update changelog

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-10-10 17:12:40 +02:00
Honglu Fan cad9034fdb
Handle batch norm in target update (#99)
* Copy running stats regardless of tau in QRDQN and TQC. See https://github.com/DLR-RM/stable-baselines3/issues/996

* Copy running stats regardless of tau in QRDQN and TQC. See https://github.com/DLR-RM/stable-baselines3/issues/996

* Copy running stats regardless of tau in QRDQN and TQC. See https://github.com/DLR-RM/stable-baselines3/issues/996

* roll back test_cnn.py
2022-08-27 12:31:00 +02:00
Antonin RAFFIN db4c0114d0
Update default TQC net arch when using NatureCnn (#79)
* Update default TQC net arch when using NatureCnn

* Bump version
2022-06-18 10:53:29 +02:00
Antonin RAFFIN bec00386d1
Upgrade to python 3.7+ syntax (#69)
* Upgrade to python 3.7+ syntax

* Switch to PyTorch 1.11
2022-04-25 13:02:07 +02:00
Grégoire Passault 99853265a9
Using policy_aliases instead of register_policy (#66)
* Using policy_aliases instead of register_policy

* Moving policy_aliases definitions

* Update SB3 version

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-04-08 21:36:23 +02:00
Adam Gleave 901a648507
Upgrade Gym to 0.21 (#59)
* Pendulum-v0 -> Pendulum-v1

* Reformat with black

* Update changelog

* Fix dtype bug in TimeFeatureWrapper

* Update version and removed forward calls

* Update CI

* Fix min version

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-02-22 16:25:43 +01:00
Antonin RAFFIN a1b5ea67ae
Multiprocessing support for off policy algorithms (#50)
* TQC support for multienv

* Add optional layer norm for TQC

* Add layer nprm for all policies

* Revert "Add layer nprm for all policies"

This reverts commit 1306c3c64eb12613464982c66cb416a3bbc66285.

* Revert "Add optional layer norm for TQC"

This reverts commit 200222e3a8878007aa6032d540ae74274a4d0788.

* Add experimental support to train off-policy algorithms with multiple envs

* Bump version

* Update version
2021-12-02 10:40:21 +01:00
Antonin RAFFIN 91f9b1ed34
Remove sde net arch (#44) 2021-09-28 21:59:59 +02:00
Scott Brownlie b2e7126840
Train/Eval Mode Support (#39)
* switch models between train and eval mode

* update changelog

* update release in change log

* Update dependency

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2021-09-08 12:54:50 +02:00
Antonin RAFFIN 36eca8ee79
Fix type annotation + add python 3.9 + citation (#37) 2021-07-29 18:14:03 +02:00
Antonin RAFFIN 2258c72215
Update to new logger (#32) 2021-06-14 17:25:08 +02:00
Antonin Raffin 30cc206578 Add test for pytorch variables 2021-05-12 11:39:56 +02:00
Antonin RAFFIN 3665695d1e
Dictionary Observations (#29)
* Add TQC support for new HER version

* Add dict obs support

* Add support for dict obs
2021-05-11 13:24:31 +02:00
Antonin RAFFIN 61bfdbc00a
Fix unused code (#28)
* Fix unused code

* Update changelog

* Update SB3 dependency
2021-05-05 11:42:10 +02:00
Antonin RAFFIN 9824daca44
Bug fix for QR-DQN (#21)
* Bug fix for QR-DQN

* Upgrade SB3
2021-03-06 14:54:43 +01:00
Antonin RAFFIN 74e60381a6
Upgrade Stable-Baselines3 (#19)
* Upgrade Stable-Baselines3

* Fix policy saving/loading
2021-02-27 18:17:22 +01:00
Toshiki Watanabe b30397fff5
Add QR-DQN (#13)
* Add QR-DQN(WIP)

* Update docstring

* Add quantile_huber_loss

* Fix typo

* Remove unnecessary lines

* Update variable names and comments in quantile_huber_loss

* Fix mutable arguments

* Update variable names

* Ignore import not used warnings

* Fix default parameter of optimizer in QR-DQN

* Update quantile_huber_loss to have more reasonable interface

* update tests

* Add assertion to quantile_huber_loss

* Update variable names of quantile regression

* Update comments

* Reduce the number of quantiles during test

* Update comment

* Update quantile_huber_loss

* Fix isort

* Add document of QR-DQN without results

* Update docs

* Fix bugs

* Update doc

* Add comments about shape

* Minor edits

* Update comments

* Add benchmark

* Doc fixes

* Update doc

* Bug fix in saving/loading + update tests

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2020-12-21 11:17:48 +01:00
Antonin RAFFIN eccdc55fdd Add missing param to docstring 2020-12-08 18:03:56 +01:00
Antonin RAFFIN 857a087a2a
Update TQC to match SB3 (#14) 2020-12-08 15:35:50 +01:00
Antonin RAFFIN 2ce8d278cc
Fix features extractor issue (#5)
* Fix feature extractor issue

* Sync with SB3 PR
2020-10-27 14:30:35 +01:00
Antonin RAFFIN 0700c3eeb0
Add TQC (#4)
* Add TQC doc

* Polish code

* Update doc

* Update results

* Update doc

* Update doc

* Add note about PyBullet envs
2020-10-22 13:43:46 +02:00
Antonin RAFFIN 5d7b79d41a Improve coverage 2020-10-12 20:17:33 +02:00
Antonin RAFFIN 7609c87e84 Cleanup TQC 2020-10-12 19:50:08 +02:00
Antonin RAFFIN 5217a0bd73 Disable n-step replay 2020-09-25 13:18:24 +02:00
Antonin RAFFIN 0d9f2e229e Add TQC and base scripts 2020-09-25 12:47:45 +02:00