stable-baselines3-contrib-sacd

Commit Graph

Author	SHA1	Message	Date
Armand du Parc Locmaria	cd31e89e26	Fix `train_freq` type annotation for TQC and QR-DQN (#229) * fix train_freq type for tqc and qrdn * fix typo * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2024-05-06 14:20:28 +01:00
Antonin RAFFIN	c965ba9d3b	Remove PyType and upgrade to latest SB3 version (#215)	2024-05-06 14:20:28 +01:00
Antonin RAFFIN	de92025bb2	Prepare Release v2.0 (#192)	2023-06-23 13:10:17 +02:00
Antonin RAFFIN	21cc96cafd	Add Gymnasium support (#152) * Add support for Gym 0.24 * Fixes for gym 0.24 * Fix for new reset signature * Add tmp SB3 branch * Fixes for gym 0.26 * Remove unused import * Fix dependency * Type annotations fixes * Reformat * Reformat with black 23 * Move to gymnasium * Patch env if needed * Fix types * Fix CI * Fixes for gymnasium * Fix wrapper annotations * Update version * Fix type check * Update QRDQN type hints and bug fix with multi envs * Fix TQC type hints * Fix TRPO type hints * Additional fixes * Update SB3 version * Update issue templates and CI --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-04-14 13:52:07 +02:00
Jonas Reiher	aacded79c5	Add stats window argument (#171) * added missing tensorboard_log docstring * added stats_window_size argument to all models * changelog updated * Update SB3 version * fixed passing stats_window_size to parent * added test of stats_window_size --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-04-05 18:47:27 +02:00
Antonin RAFFIN	728c1c5b7f	Issue forms and pyproject.toml (#162) * Issue forms and pyproject.toml * [ci skip] Fix typos * Fix isort config * Use secret link to download atari roms * Fix for mypy and update config * Upgrade SB3 and fix warnings * Fix doc build * Update Makefile * Lint first	2023-03-11 22:57:45 +01:00
Alex Pasquali	376d9551de	Update MaskablePPO docs (#150) * MaskablePPO docs Added a warning about possible crashes caused by chack_env in case of invalid actions. * Reformat with black 23 * Rephrase note on action sampling * Fix action noise * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-02-13 14:31:49 +01:00
Quentin Gallouédec	7c4a249fa4	Standardize the use of ``from gym import spaces`` (#131) * Standardize from gym import spaces * update changelog * update issue template * update version * Update version	2023-01-02 15:35:00 +01:00
Quentin Gallouédec	9cf8b5076f	Construct tensors directly on GPUs (#128) * `to(device)` to `device=device` and `float()` to `dtype=th.float32` * Update changelog * Fix type checking Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-23 00:44:25 +01:00
Alex Pasquali	ab8684f469	[Feature] Non-shared features extractor in on-policy algorithms (#130) * Modified sb3_contrib/common/maskable/policies.py - Added support for non-shared features extractor in file sb3_contrib/common/maskable/policies.py - updated changelog * Modified sb3_contrib/common/recurrent/policies.py * Modified sb3_contrib/qrdqn/policies.py and sb3_contrib/tqc/policies.py * Updated test_cnn.py * Upgrade SB3 version * Revert changes in formatting * Remove duplicate normalize_images * Add test for image-like inputs * Fixes and add more tests * Update SB3 version * Fix ARS warnings Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-12-23 00:23:45 +01:00
Zikang Xiong	ddb3a1355e	Expose modules in `__init__.py` with `__all__` attribute (#124) * expose modules in __init__.py with __all__ attribute * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-12-05 15:53:57 +01:00
Quentin Gallouédec	36aeae18b5	Fix `Self` return type (#116) * Self hint for distributions * ClassSelf to SelfClass	2022-11-22 13:12:35 +01:00
Antonin RAFFIN	c75ad7dd58	Remove deprecated features (#108) * Remove deprecated features * Upgrade SB3 * Fix tests	2022-10-11 13:04:18 +02:00
Antonin RAFFIN	52795a307e	Add progress bar argument (#107) * Add progress bar argument * Sort imports	2022-10-10 18:44:13 +02:00
Quentin Gallouédec	e9c97948c8	Fixed the return type of ``.load()`` methods (#106) * Fix return type for learn using TypeVar * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-10-10 17:21:38 +02:00
Quentin Gallouédec	dec7b5303a	Deprecate ``create_eval_env``, ``eval_env`` and ``eval_freq`` parameter (#105) * Deprecate ``eval_env``, ``eval_freq```and ``create_eval_env`` * Update changelog * Typo * Raise deprecation warining in _setup_learn * Upgrade to latest SB3 version and update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-10-10 17:12:40 +02:00
Honglu Fan	cad9034fdb	Handle batch norm in target update (#99) * Copy running stats regardless of tau in QRDQN and TQC. See https://github.com/DLR-RM/stable-baselines3/issues/996 * Copy running stats regardless of tau in QRDQN and TQC. See https://github.com/DLR-RM/stable-baselines3/issues/996 * Copy running stats regardless of tau in QRDQN and TQC. See https://github.com/DLR-RM/stable-baselines3/issues/996 * roll back test_cnn.py	2022-08-27 12:31:00 +02:00
Antonin RAFFIN	db4c0114d0	Update default TQC net arch when using NatureCnn (#79) * Update default TQC net arch when using NatureCnn * Bump version	2022-06-18 10:53:29 +02:00
Antonin RAFFIN	bec00386d1	Upgrade to python 3.7+ syntax (#69) * Upgrade to python 3.7+ syntax * Switch to PyTorch 1.11	2022-04-25 13:02:07 +02:00
Grégoire Passault	99853265a9	Using policy_aliases instead of register_policy (#66) * Using policy_aliases instead of register_policy * Moving policy_aliases definitions * Update SB3 version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-04-08 21:36:23 +02:00
Adam Gleave	901a648507	Upgrade Gym to 0.21 (#59) * Pendulum-v0 -> Pendulum-v1 * Reformat with black * Update changelog * Fix dtype bug in TimeFeatureWrapper * Update version and removed forward calls * Update CI * Fix min version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-02-22 16:25:43 +01:00
Antonin RAFFIN	a1b5ea67ae	Multiprocessing support for off policy algorithms (#50) * TQC support for multienv * Add optional layer norm for TQC * Add layer nprm for all policies * Revert "Add layer nprm for all policies" This reverts commit 1306c3c64eb12613464982c66cb416a3bbc66285. * Revert "Add optional layer norm for TQC" This reverts commit 200222e3a8878007aa6032d540ae74274a4d0788. * Add experimental support to train off-policy algorithms with multiple envs * Bump version * Update version	2021-12-02 10:40:21 +01:00
Antonin RAFFIN	91f9b1ed34	Remove sde net arch (#44)	2021-09-28 21:59:59 +02:00
Scott Brownlie	b2e7126840	Train/Eval Mode Support (#39) * switch models between train and eval mode * update changelog * update release in change log * Update dependency Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-09-08 12:54:50 +02:00
Antonin RAFFIN	36eca8ee79	Fix type annotation + add python 3.9 + citation (#37)	2021-07-29 18:14:03 +02:00
Antonin RAFFIN	2258c72215	Update to new logger (#32)	2021-06-14 17:25:08 +02:00
Antonin Raffin	30cc206578	Add test for pytorch variables	2021-05-12 11:39:56 +02:00
Antonin RAFFIN	3665695d1e	Dictionary Observations (#29) * Add TQC support for new HER version * Add dict obs support * Add support for dict obs	2021-05-11 13:24:31 +02:00
Antonin RAFFIN	61bfdbc00a	Fix unused code (#28) * Fix unused code * Update changelog * Update SB3 dependency	2021-05-05 11:42:10 +02:00
Antonin RAFFIN	9824daca44	Bug fix for QR-DQN (#21) * Bug fix for QR-DQN * Upgrade SB3	2021-03-06 14:54:43 +01:00
Antonin RAFFIN	74e60381a6	Upgrade Stable-Baselines3 (#19) * Upgrade Stable-Baselines3 * Fix policy saving/loading	2021-02-27 18:17:22 +01:00
Toshiki Watanabe	b30397fff5	Add QR-DQN (#13) * Add QR-DQN(WIP) * Update docstring * Add quantile_huber_loss * Fix typo * Remove unnecessary lines * Update variable names and comments in quantile_huber_loss * Fix mutable arguments * Update variable names * Ignore import not used warnings * Fix default parameter of optimizer in QR-DQN * Update quantile_huber_loss to have more reasonable interface * update tests * Add assertion to quantile_huber_loss * Update variable names of quantile regression * Update comments * Reduce the number of quantiles during test * Update comment * Update quantile_huber_loss * Fix isort * Add document of QR-DQN without results * Update docs * Fix bugs * Update doc * Add comments about shape * Minor edits * Update comments * Add benchmark * Doc fixes * Update doc * Bug fix in saving/loading + update tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2020-12-21 11:17:48 +01:00
Antonin RAFFIN	eccdc55fdd	Add missing param to docstring	2020-12-08 18:03:56 +01:00
Antonin RAFFIN	857a087a2a	Update TQC to match SB3 (#14)	2020-12-08 15:35:50 +01:00
Antonin RAFFIN	2ce8d278cc	Fix features extractor issue (#5) * Fix feature extractor issue * Sync with SB3 PR	2020-10-27 14:30:35 +01:00
Antonin RAFFIN	0700c3eeb0	Add TQC (#4) * Add TQC doc * Polish code * Update doc * Update results * Update doc * Update doc * Add note about PyBullet envs	2020-10-22 13:43:46 +02:00
Antonin RAFFIN	5d7b79d41a	Improve coverage	2020-10-12 20:17:33 +02:00
Antonin RAFFIN	7609c87e84	Cleanup TQC	2020-10-12 19:50:08 +02:00
Antonin RAFFIN	5217a0bd73	Disable n-step replay	2020-09-25 13:18:24 +02:00
Antonin RAFFIN	0d9f2e229e	Add TQC and base scripts	2020-09-25 12:47:45 +02:00

40 Commits