stable-baselines3-contrib-sacd

Commit Graph

Author	SHA1	Message	Date
Antonin RAFFIN	728c1c5b7f	Issue forms and pyproject.toml (#162) * Issue forms and pyproject.toml * [ci skip] Fix typos * Fix isort config * Use secret link to download atari roms * Fix for mypy and update config * Upgrade SB3 and fix warnings * Fix doc build * Update Makefile * Lint first	2023-03-11 22:57:45 +01:00
Alex Pasquali	6bc8e426bf	Removed shared layers in mlp_extractor (#137) * Removed shared layers in mlp_extractor * Add ruff * Update version and add warning Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-25 16:28:27 +01:00
Alex Pasquali	b5aa9a47ce	Deprecation of shared layers in `mlp_extractor` (#133) * Deprecation of shared layers in mlp_extractor * Fix missing import * Reformat and update tests Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-05 10:42:22 +01:00
Quentin Gallouédec	7c4a249fa4	Standardize the use of ``from gym import spaces`` (#131) * Standardize from gym import spaces * update changelog * update issue template * update version * Update version	2023-01-02 15:35:00 +01:00
Quentin Gallouédec	9cf8b5076f	Construct tensors directly on GPUs (#128) * `to(device)` to `device=device` and `float()` to `dtype=th.float32` * Update changelog * Fix type checking Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-23 00:44:25 +01:00
Alex Pasquali	ab8684f469	[Feature] Non-shared features extractor in on-policy algorithms (#130) * Modified sb3_contrib/common/maskable/policies.py - Added support for non-shared features extractor in file sb3_contrib/common/maskable/policies.py - updated changelog * Modified sb3_contrib/common/recurrent/policies.py * Modified sb3_contrib/qrdqn/policies.py and sb3_contrib/tqc/policies.py * Updated test_cnn.py * Upgrade SB3 version * Revert changes in formatting * Remove duplicate normalize_images * Add test for image-like inputs * Fixes and add more tests * Update SB3 version * Fix ARS warnings Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-12-23 00:23:45 +01:00
Quentin Gallouédec	b3e4ddd09a	Fix `sb3_contrib/common/recurrent/type_aliases.py` type hint (#121) * Update setup.cfg * Update changelog * Update type aliases	2022-11-29 10:41:07 +01:00
Antonin RAFFIN	a9735b9f31	Fix reshape LSTM states (#112) * Fix LSTM states reshape * Fix warnings and update changelog * Remove unused variable * Fix runtime error when using n_lstm_layers > 1	2022-10-26 18:03:45 +02:00
Antonin RAFFIN	c75ad7dd58	Remove deprecated features (#108) * Remove deprecated features * Upgrade SB3 * Fix tests	2022-10-11 13:04:18 +02:00
Quentin Gallouédec	7993b75781	Support `device="auto"` for buffers and set it as default value (#98) * Default device for buffer is auto * `device=auto` in ARS * Undo ARS change * Update changelog * Update min SB3 version Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-24 09:48:18 +02:00
Max Lodel	fc68af8841	Fixed shared_lstm argument in CNN and MultiInput Policies for RecurrentPPO (#90) * fixed shared_lstm parameter in CNN and MultiInput Policies * updated tests * changelog * Fix FPS for recurrent PPO * Fix import * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-07-26 00:27:17 +02:00
Quentin Gallouédec	3cbd2429be	Fix returned type in predict (#88) * actions[0] -> actions.squeeze(0) * Update changelog * Update changelog * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-18 11:49:03 +02:00
Antonin RAFFIN	75b2de1399	Recurrent PPO (#53) * Running (not working yet) version of recurrent PPO * Fixes for multi envs * Save WIP, rework the sampling * Add Box support * Fix sample order * Being cleanup, code is broken (again) * First working version (no shared lstm) * Start cleanup * Try rnn with value function * Re-enable batch size * Deactivate vf rnn * Allow any batch size * Add support for evaluation * Add CNN support * Fix start of sequence * Allow shared LSTM * Rename mask to episode_start * Fix type hint * Enable LSTM for critic * Clean code * Fix for CNN LSTM * Fix sampling with n_layers > 1 * Add std logger * Update wording * Rename and add dict obs support * Fixes for dict obs support * Do not run slow tests * Fix doc * Update recurrent PPO example * Update README * Use Pendulum-v1 for tests * Fix image env * Speedup LSTM forward pass (#63) * added more efficient lstm implementation * Rename and add comment Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org> * Fixes * Remove OpenAI sampling and improve coverage * Sync with SB3 PPO * Pass state shape and allow lstm kwargs * Update tests * Add masking for padded sequences * Update default in perf test * Remove TODO, mask is now working * Add helper to remove duplicated code, remove hack for padding * Enable LSTM critic and raise threshold for cartpole with no vel * Fix tests * Update doc and tests * Doc fix * Fix for new Sphinx version * Fix doc note * Switch to batch first, no more additional swap * Add comments and mask entropy loss Co-authored-by: Neville Walo <43504521+Walon1998@users.noreply.github.com>	2022-05-30 04:31:12 +02:00

13 Commits