stable-baselines3-contrib-sacd

Commit Graph

Author	SHA1	Message	Date
Alex Pasquali	6bc8e426bf	Removed shared layers in mlp_extractor (#137 ) * Removed shared layers in mlp_extractor * Add ruff * Update version and add warning Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-25 16:28:27 +01:00
Antonin RAFFIN	7bf9cf3f3a	Release v1.7.0 (#134 )	2023-01-10 22:35:18 +01:00
Alex Pasquali	b5aa9a47ce	Deprecation of shared layers in `mlp_extractor` (#133 ) * Deprecation of shared layers in mlp_extractor * Fix missing import * Reformat and update tests Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-05 10:42:22 +01:00
Quentin Gallouédec	7c4a249fa4	Standardize the use of ``from gym import spaces`` (#131 ) * Standardize from gym import spaces * update changelog * update issue template * update version * Update version	2023-01-02 15:35:00 +01:00
Quentin Gallouédec	c9bd045d5c	Add support for python3.10 (#129 ) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-23 00:54:35 +01:00
Alex Pasquali	ab8684f469	[Feature] Non-shared features extractor in on-policy algorithms (#130 ) * Modified sb3_contrib/common/maskable/policies.py - Added support for non-shared features extractor in file sb3_contrib/common/maskable/policies.py - updated changelog * Modified sb3_contrib/common/recurrent/policies.py * Modified sb3_contrib/qrdqn/policies.py and sb3_contrib/tqc/policies.py * Updated test_cnn.py * Upgrade SB3 version * Revert changes in formatting * Remove duplicate normalize_images * Add test for image-like inputs * Fixes and add more tests * Update SB3 version * Fix ARS warnings Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-12-23 00:23:45 +01:00
Quentin Gallouédec	6b23c6cfe3	Add `with_bias` parameter to `ARSPolicy` and fix `sb3_contrib/ars/policies.py` type hint (#122 ) * Update contribution.md * New loop struct to make mypy happy * Update setup.cfg * Update changelog * fix squash_output = False in ARS policy * Add with_bias parameter to ARSPolicy * Make ARSLinearPolicy a special case of ARSPolicy * Remove ars_policy from mypy exclude * Update changelog * Update SB3 version * Fix to save ARS linear policy saved with sb3-contrib < 1.7.0 * Fix test * Turn docstring into comment Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-12 13:22:09 +01:00
Antonin RAFFIN	c75ad7dd58	Remove deprecated features (#108 ) * Remove deprecated features * Upgrade SB3 * Fix tests	2022-10-11 13:04:18 +02:00
Quentin Gallouédec	dec7b5303a	Deprecate ``create_eval_env``, ``eval_env`` and ``eval_freq`` parameter (#105 ) * Deprecate ``eval_env``, ``eval_freq`` and ``create_eval_env`` * Update changelog * Typo * Raise deprecation warning in _setup_learn * Upgrade to latest SB3 version and update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-10-10 17:12:40 +02:00
Antonin RAFFIN	2490468b11	Release v1.6.1 (#104 )	2022-09-29 12:30:12 +02:00
Quentin Gallouédec	7993b75781	Support `device="auto"` for buffers and set it as default value (#98 ) * Default device for buffer is auto * `device=auto` in ARS * Undo ARS change * Update changelog * Update min SB3 version Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-24 09:48:18 +02:00
Antonin RAFFIN	087951d34b	Release v1.6.0 and bug fix for TRPO (#84 )	2022-07-12 23:12:24 +02:00
Antonin RAFFIN	cd592a111f	Upgrade min SB3 version (#70 ) * Upgrade min SB3 version * Fix for newer sphinx version	2022-05-29 21:54:23 +02:00
Antonin RAFFIN	bec00386d1	Upgrade to python 3.7+ syntax (#69 ) * Upgrade to python 3.7+ syntax * Switch to PyTorch 1.11	2022-04-25 13:02:07 +02:00
Grégoire Passault	99853265a9	Using policy_aliases instead of register_policy (#66 ) * Using policy_aliases instead of register_policy * Moving policy_aliases definitions * Update SB3 version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-04-08 21:36:23 +02:00
Antonin RAFFIN	9d7e33d213	Release v1.5.0 (#64 )	2022-03-25 15:04:53 +01:00
Adam Gleave	901a648507	Upgrade Gym to 0.21 (#59 ) * Pendulum-v0 -> Pendulum-v1 * Reformat with black * Update changelog * Fix dtype bug in TimeFeatureWrapper * Update version and removed forward calls * Update CI * Fix min version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-02-22 16:25:43 +01:00
Antonin RAFFIN	89f2bae9f6	Release 1.4.0 (#57 ) * Release 1.4.0 * Update requirements	2022-01-19 13:50:56 +01:00
Sean Gillen	675304d8fa	Augmented Random Search (ARS) (#42 ) * first pass at ars, replicates initial results, still needs more testing, cleanup * add a few docs and tests, bugfixes for ARS * debug and comment * break out dump logs * rollback so there are now predict workers, some refactoring * remove callback from self, remove torch multiprocessing * add module docs * run formatter * fix load and rerun formatter * rename to less mathy variable names, rename _validate_hypers * refactor to use evaluate_policy, linear policy no longer uses bias or squashing * move everything to torch, add support for discrete action spaces, bugfix for alive reward offset * added tests, passing all of them, add support for discrete action spaces * update documentation * allow for reward offset when there are multiple envs * update results again * Reformat * Ignore unused imports * Renaming + Cleanup * Experimental multiprocessing * Cleaner multiprocessing * Reformat * Fixes for callback * Fix combining stats * 2nd way * Make the implementation cpu only * Fixes + POC with mp module * POC Processes * Cleaner async implementation * Remove unused arg * Add typing * Revert vec normalize offset hack * Add `squash_output` parameter * Add more tests * Add comments * Update doc * Add comments * Add more logging * Fix TRPO issue on GPU * Tmp fix for ARS tests on GPU * Additional tmp fixes for ARS * update docstrings + formatting, fix bad exception string in ARSPolicy * Add comments and docstrings * Fix missing import * Fix type check * Add docstrings * GPU support, first attempt * Fix test * Add missing docstring * Typos * Update defaults hyperparameters Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-01-18 13:57:27 +01:00
Antonin RAFFIN	833669a88b	Drop python 3.6 support (#55 ) * Drop python 3.6 * Update setup file	2021-12-06 12:59:53 +01:00
Antonin RAFFIN	a1b5ea67ae	Multiprocessing support for off policy algorithms (#50 ) * TQC support for multienv * Add optional layer norm for TQC * Add layer nprm for all policies * Revert "Add layer nprm for all policies" This reverts commit 1306c3c64eb12613464982c66cb416a3bbc66285. * Revert "Add optional layer norm for TQC" This reverts commit 200222e3a8878007aa6032d540ae74274a4d0788. * Add experimental support to train off-policy algorithms with multiple envs * Bump version * Update version	2021-12-02 10:40:21 +01:00
Antonin RAFFIN	cd0a5e516f	Update citation (#54 ) * Update citation * Fixes for new SB3 version * Fix type hint * Additional fixes	2021-12-01 19:09:32 +01:00
Antonin RAFFIN	b1397bbb72	Release 1.3.0 (#48 )	2021-10-23 17:21:22 +02:00
Geoff McDonald	d6c5cea644	MaskablePPO dictionary observation support (#47 ) * Add dictionary observation support for ppo_mask. * Improving naming consistency. * Update changelog. * Reformat and add test * Update doc * Update README and setup Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-10-23 17:05:37 +02:00
Scott Brownlie	b2e7126840	Train/Eval Mode Support (#39 ) * switch models between train and eval mode * update changelog * update release in change log * Update dependency Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-09-08 12:54:50 +02:00
Antonin RAFFIN	ae39e00c44	Release v1.1.0 (#34 )	2021-07-02 11:38:46 +02:00
Antonin RAFFIN	2258c72215	Update to new logger (#32 )	2021-06-14 17:25:08 +02:00
Antonin RAFFIN	08418a3cc8	Bump SB3 version (#30 )	2021-05-12 11:46:16 +02:00
Antonin RAFFIN	3665695d1e	Dictionary Observations (#29 ) * Add TQC support for new HER version * Add dict obs support * Add support for dict obs	2021-05-11 13:24:31 +02:00
Antonin RAFFIN	61bfdbc00a	Fix unused code (#28 ) * Fix unused code * Update changelog * Update SB3 dependency	2021-05-05 11:42:10 +02:00
Antonin RAFFIN	81ef23d270	SB3 v1.0 (#23 )	2021-03-17 14:32:58 +01:00
Antonin RAFFIN	9824daca44	Bug fix for QR-DQN (#21 ) * Bug fix for QR-DQN * Upgrade SB3	2021-03-06 14:54:43 +01:00
Antonin RAFFIN	7c2eb833c0	Upgrade SB3 (#20 )	2021-02-27 19:59:21 +01:00
Antonin RAFFIN	74e60381a6	Upgrade Stable-Baselines3 (#19 ) * Upgrade Stable-Baselines3 * Fix policy saving/loading	2021-02-27 18:17:22 +01:00
Toshiki Watanabe	4b4d487fdb	Fix the target calculation of QR-DQN (#18 ) * Fix the target calculation of QR-DQN * Update doc * Update version * Update changelog * Update README Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-01-11 14:11:16 +01:00
Antonin RAFFIN	e9c6135f90	Update setup readme	2020-12-21 11:20:32 +01:00
Antonin RAFFIN	3598ca284a	Update requirements (#15 )	2020-12-13 17:29:15 +01:00
Antonin RAFFIN	aac20bd1e6	Release v0.10.0	2020-10-28 15:08:07 +01:00
Antonin RAFFIN	b896b7492e	Update dependencies	2020-10-22 16:35:28 +02:00
Antonin RAFFIN	e8093965c7	Fix doc build	2020-10-22 14:46:05 +02:00
Antonin RAFFIN	7609c87e84	Cleanup TQC	2020-10-12 19:50:08 +02:00
Antonin RAFFIN	99fe824f76	Update requirements	2020-09-25 16:00:49 +02:00
Antonin RAFFIN	17c2dabc7f	Update CI	2020-09-25 12:50:52 +02:00
Antonin RAFFIN	0d9f2e229e	Add TQC and base scripts	2020-09-25 12:47:45 +02:00

44 Commits