Commit Graph

44 Commits

Author SHA1 Message Date
Alex Pasquali 6bc8e426bf
Removed shared layers in mlp_extractor (#137)
* Removed shared layers in mlp_extractor

* Add ruff

* Update version and add warning

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-01-25 16:28:27 +01:00
Antonin RAFFIN 7bf9cf3f3a
Release v1.7.0 (#134) 2023-01-10 22:35:18 +01:00
Alex Pasquali b5aa9a47ce
Deprecation of shared layers in `mlp_extractor` (#133)
* Deprecation of shared layers in mlp_extractor

* Fix missing import

* Reformat and update tests

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-01-05 10:42:22 +01:00
Quentin Gallouédec 7c4a249fa4
Standardize the use of ``from gym import spaces`` (#131)
* Standardize from gym import spaces

* update changelog

* update issue template

* update version

* Update version
2023-01-02 15:35:00 +01:00
Quentin Gallouédec c9bd045d5c
Add support for python3.10 (#129)
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2022-12-23 00:54:35 +01:00
Alex Pasquali ab8684f469
[Feature] Non-shared features extractor in on-policy algorithms (#130)
* Modified sb3_contrib/common/maskable/policies.py

- Added support for non-shared features extractor in file sb3_contrib/common/maskable/policies.py
- updated changelog

* Modified sb3_contrib/common/recurrent/policies.py

* Modified sb3_contrib/qrdqn/policies.py and sb3_contrib/tqc/policies.py

* Updated test_cnn.py

* Upgrade SB3 version

* Revert changes in formatting

* Remove duplicate normalize_images

* Add test for image-like inputs

* Fixes and add more tests

* Update SB3 version

* Fix ARS warnings

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-12-23 00:23:45 +01:00
Quentin Gallouédec 6b23c6cfe3
Add `with_bias` parameter to `ARSPolicy` and fix `sb3_contrib/ars/policies.py` type hint (#122)
* Update contribution.md

* New loop struct to make mypy happy

* Update setup.cfg

* Update changelog

* fix squash_output = False in ARS policy

* Add with_bias parameter to ARSPolicy

* Make ARSLinearPolicy a special case of ARSPolicy

* Remove ars_policy from mypy exclude

* Update changelog

* Update SB3 version

* Fix to save ARS linear policy saved with sb3-contrib < 1.7.0

* Fix test

* Turn docstring into comment

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2022-12-12 13:22:09 +01:00
Antonin RAFFIN c75ad7dd58
Remove deprecated features (#108)
* Remove deprecated features

* Upgrade SB3

* Fix tests
2022-10-11 13:04:18 +02:00
Quentin Gallouédec dec7b5303a
Deprecate ``create_eval_env``, ``eval_env`` and ``eval_freq`` parameters (#105)
* Deprecate ``eval_env``, ``eval_freq`` and ``create_eval_env``

* Update changelog

* Typo

* Raise deprecation warning in _setup_learn

* Upgrade to latest SB3 version and update changelog

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-10-10 17:12:40 +02:00
Antonin RAFFIN 2490468b11
Release v1.6.1 (#104) 2022-09-29 12:30:12 +02:00
Quentin Gallouédec 7993b75781
Support `device="auto"` for buffers and set it as default value (#98)
* Default device for buffer is auto

* `device=auto` in ARS

* Undo ARS change

* Update changelog

* Update min SB3 version

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2022-08-24 09:48:18 +02:00
Antonin RAFFIN 087951d34b
Release v1.6.0 and bug fix for TRPO (#84) 2022-07-12 23:12:24 +02:00
Antonin RAFFIN cd592a111f
Upgrade min SB3 version (#70)
* Upgrade min SB3 version

* Fix for newer sphinx version
2022-05-29 21:54:23 +02:00
Antonin RAFFIN bec00386d1
Upgrade to python 3.7+ syntax (#69)
* Upgrade to python 3.7+ syntax

* Switch to PyTorch 1.11
2022-04-25 13:02:07 +02:00
Grégoire Passault 99853265a9
Using policy_aliases instead of register_policy (#66)
* Using policy_aliases instead of register_policy

* Moving policy_aliases definitions

* Update SB3 version

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-04-08 21:36:23 +02:00
Antonin RAFFIN 9d7e33d213
Release v1.5.0 (#64) 2022-03-25 15:04:53 +01:00
Adam Gleave 901a648507
Upgrade Gym to 0.21 (#59)
* Pendulum-v0 -> Pendulum-v1

* Reformat with black

* Update changelog

* Fix dtype bug in TimeFeatureWrapper

* Update version and removed forward calls

* Update CI

* Fix min version

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-02-22 16:25:43 +01:00
Antonin RAFFIN 89f2bae9f6
Release 1.4.0 (#57)
* Release 1.4.0

* Update requirements
2022-01-19 13:50:56 +01:00
Sean Gillen 675304d8fa
Augmented Random Search (ARS) (#42)
* first pass at ars, replicates initial results, still needs more testing, cleanup

* add a few docs and tests, bugfixes for ARS

* debug and comment

* break out dump logs

* rollback so there are now predict workers, some refactoring

* remove callback from self, remove torch multiprocessing

* add module docs

* run formatter

* fix load and rerun formatter

* rename to less mathy variable names, rename _validate_hypers

* refactor to use evaluate_policy, linear policy no longer uses bias or squashing

* move everything to torch, add support for discrete action spaces, bugfix for alive reward offset

* added tests, passing all of them, add support for discrete action spaces

* update documentation

* allow for reward offset when there are multiple envs

* update results again

* Reformat

* Ignore unused imports

* Renaming + Cleanup

* Experimental multiprocessing

* Cleaner multiprocessing

* Reformat

* Fixes for callback

* Fix combining stats

* 2nd way

* Make the implementation cpu only

* Fixes + POC with mp module

* POC Processes

* Cleaner async implementation

* Remove unused arg

* Add typing

* Revert vec normalize offset hack

* Add `squash_output` parameter

* Add more tests

* Add comments

* Update doc

* Add comments

* Add more logging

* Fix TRPO issue on GPU

* Tmp fix for ARS tests on GPU

* Additional tmp fixes for ARS

* update docstrings + formatting, fix bad exception string in ARSPolicy

* Add comments and docstrings

* Fix missing import

* Fix type check

* Add docstrings

* GPU support, first attempt

* Fix test

* Add missing docstring

* Typos

* Update default hyperparameters

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2022-01-18 13:57:27 +01:00
Antonin RAFFIN 833669a88b
Drop python 3.6 support (#55)
* Drop python 3.6

* Update setup file
2021-12-06 12:59:53 +01:00
Antonin RAFFIN a1b5ea67ae
Multiprocessing support for off policy algorithms (#50)
* TQC support for multienv

* Add optional layer norm for TQC

* Add layer nprm for all policies

* Revert "Add layer nprm for all policies"

This reverts commit 1306c3c64eb12613464982c66cb416a3bbc66285.

* Revert "Add optional layer norm for TQC"

This reverts commit 200222e3a8878007aa6032d540ae74274a4d0788.

* Add experimental support to train off-policy algorithms with multiple envs

* Bump version

* Update version
2021-12-02 10:40:21 +01:00
Antonin RAFFIN cd0a5e516f
Update citation (#54)
* Update citation

* Fixes for new SB3 version

* Fix type hint

* Additional fixes
2021-12-01 19:09:32 +01:00
Antonin RAFFIN b1397bbb72
Release 1.3.0 (#48) 2021-10-23 17:21:22 +02:00
Geoff McDonald d6c5cea644
MaskablePPO dictionary observation support (#47)
* Add dictionary observation support for ppo_mask.

* Improving naming consistency.

* Update changelog.

* Reformat and add test

* Update doc

* Update README and setup

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2021-10-23 17:05:37 +02:00
Scott Brownlie b2e7126840
Train/Eval Mode Support (#39)
* switch models between train and eval mode

* update changelog

* update release in change log

* Update dependency

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2021-09-08 12:54:50 +02:00
Antonin RAFFIN ae39e00c44
Release v1.1.0 (#34) 2021-07-02 11:38:46 +02:00
Antonin RAFFIN 2258c72215
Update to new logger (#32) 2021-06-14 17:25:08 +02:00
Antonin RAFFIN 08418a3cc8
Bump SB3 version (#30) 2021-05-12 11:46:16 +02:00
Antonin RAFFIN 3665695d1e
Dictionary Observations (#29)
* Add TQC support for new HER version

* Add dict obs support

* Add support for dict obs
2021-05-11 13:24:31 +02:00
Antonin RAFFIN 61bfdbc00a
Fix unused code (#28)
* Fix unused code

* Update changelog

* Update SB3 dependency
2021-05-05 11:42:10 +02:00
Antonin RAFFIN 81ef23d270
SB3 v1.0 (#23) 2021-03-17 14:32:58 +01:00
Antonin RAFFIN 9824daca44
Bug fix for QR-DQN (#21)
* Bug fix for QR-DQN

* Upgrade SB3
2021-03-06 14:54:43 +01:00
Antonin RAFFIN 7c2eb833c0
Upgrade SB3 (#20) 2021-02-27 19:59:21 +01:00
Antonin RAFFIN 74e60381a6
Upgrade Stable-Baselines3 (#19)
* Upgrade Stable-Baselines3

* Fix policy saving/loading
2021-02-27 18:17:22 +01:00
Toshiki Watanabe 4b4d487fdb
Fix the target calculation of QR-DQN (#18)
* Fix the target calculation of QR-DQN

* Update doc

* Update version

* Update changelog

* Update README

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2021-01-11 14:11:16 +01:00
Antonin RAFFIN e9c6135f90 Update setup readme 2020-12-21 11:20:32 +01:00
Antonin RAFFIN 3598ca284a
Update requirements (#15) 2020-12-13 17:29:15 +01:00
Antonin RAFFIN aac20bd1e6 Release v0.10.0 2020-10-28 15:08:07 +01:00
Antonin RAFFIN b896b7492e Update dependencies 2020-10-22 16:35:28 +02:00
Antonin RAFFIN e8093965c7 Fix doc build 2020-10-22 14:46:05 +02:00
Antonin RAFFIN 7609c87e84 Cleanup TQC 2020-10-12 19:50:08 +02:00
Antonin RAFFIN 99fe824f76 Update requirements 2020-09-25 16:00:49 +02:00
Antonin RAFFIN 17c2dabc7f Update CI 2020-09-25 12:50:52 +02:00
Antonin RAFFIN 0d9f2e229e Add TQC and base scripts 2020-09-25 12:47:45 +02:00