Commit Graph

  • adfbeb1b01 Fix typo in changelog (#237) master Antonin RAFFIN 2024-04-01 16:07:19 +0200
  • 17cd797d5c Release v2.3.0 (#236) Antonin RAFFIN 2024-03-31 20:39:46 +0200
  • 34eceaf382 Log success rate for PPO variants (#235) Antonin RAFFIN 2024-03-31 19:51:48 +0200
  • 89d0113037 Update ruff and SB3 dependencies (#232) Antonin RAFFIN 2024-03-11 14:29:47 +0100
  • 7dd6c39fba Fix MaskablePPO type annotations (#233) Antonin RAFFIN 2024-03-11 14:10:12 +0100
  • cd31e89e26 Fix `train_freq` type annotation for TQC and QR-DQN (#229) Armand du Parc Locmaria 2024-01-24 10:44:38 +0100
  • bc3c0a9595 Add notes about MaskablePPO (#227) Tjeerd Bakker 2024-01-18 17:03:02 +0100
  • 3f0c5088b3 Update QRDQN defaults (#225) Antonin RAFFIN 2024-01-12 16:17:44 +0100
  • 1553b66ee4 Update `_process_sequence()` docstring (#219) Rogério Júnior 2023-12-05 04:59:48 -0800
  • 94a5daab02 Update SB3 version (#217) Antonin RAFFIN 2023-11-18 00:02:47 +0100
  • ebb74c44ec Release v2.2.0 (#216) Antonin RAFFIN 2023-11-16 18:06:50 +0100
  • 6cfa588c0f Update pytorch version Antonin Raffin 2023-11-16 18:00:31 +0100
  • c965ba9d3b Remove PyType and upgrade to latest SB3 version (#215) Antonin RAFFIN 2023-11-08 09:50:06 +0100
  • 5e437fc4dc Add rollout_buffer_class to TRPO (#214) M. Ernestus 2023-10-30 16:10:41 +0100
  • 4d7ed004af Sync SB3 Contrib with SB3 (#213) Antonin RAFFIN 2023-10-25 14:32:51 +0200
  • 5be11deaf3 Fix RTD default theme Antonin Raffin 2023-09-07 09:53:40 +0200
  • cf4ed5fe65 Release v2.1.0 (#204) Antonin RAFFIN 2023-08-17 22:17:12 +0200
  • 4e99b74e90 Merge branch 'Stable-Baselines-Team:master' into master Paul Auerbach 2023-08-07 14:34:38 +0200
  • bc08ee985e Added save_load test for SACD Paul Auerbach 2023-08-07 14:23:07 +0200
  • d97dbc727c Added doc page for SACD Paul Auerbach 2023-08-07 14:03:12 +0200
  • 610fd3dcf6 Added run test for SACD Paul Auerbach 2023-08-07 13:16:00 +0200
  • fca2c6d490 Prepared files for merge request (minor cleanup) Paul Auerbach 2023-08-04 18:19:34 +0200
  • 4a37f58259 Code style changes Paul Auerbach 2023-08-02 13:52:40 +0200
  • 7711813dbb Reworked code to work whith more than 2 critic networks Paul Auerbach 2023-08-02 13:17:22 +0200
  • 875b8bca0d Fixed bugs in that lead to wrong results, currently only working with 2 critics Paul Auerbach 2023-08-01 15:09:55 +0200
  • dfa23bdf9c Bugfix/ppo mask stats window size (#199) PatrickHelm 2023-08-01 11:53:27 +0200
  • a14ae69b6b Added first version of SAC Discrete, which is running but not learning currently Paul Auerbach 2023-07-31 16:07:08 +0200
  • 35f06254ba Drop python 3.7, add 3.11 and update github templates (#194) Antonin RAFFIN 2023-07-03 12:45:20 +0200
  • de92025bb2 Prepare Release v2.0 (#192) Antonin RAFFIN 2023-06-23 13:10:17 +0200
  • 6e1aba45e3 Update version and fix #188 (#190) Antonin RAFFIN 2023-06-07 16:51:42 +0200
  • c84079d4f3 Update pypi build command Antonin Raffin 2023-05-24 11:41:28 +0200
  • d467d7a844 Update AsyncEval seeding (#185) Antonin RAFFIN 2023-05-20 10:57:31 +0200
  • 86fb056fda Update doc: switch from Gym to Gymnasium (#182) Antonin RAFFIN 2023-05-10 11:40:40 +0200
  • 21cc96cafd Add Gymnasium support (#152) Antonin RAFFIN 2023-04-14 13:52:07 +0200
  • a84ad3aa7d Release v1.8.0 (#173) Antonin RAFFIN 2023-04-08 15:50:04 +0200
  • aacded79c5 Add stats window argument (#171) Jonas Reiher 2023-04-05 18:47:27 +0200
  • ce115982aa Fix QR-DQN type hints (#170) Antonin RAFFIN 2023-03-30 11:50:26 +0200
  • b5fd6e65ba Update SB3 and config (#167) Antonin RAFFIN 2023-03-20 12:35:37 +0100
  • 1f9568b2da Fix Atari Roms Download (#164) Antonin RAFFIN 2023-03-12 19:06:23 +0100
  • 728c1c5b7f Issue forms and pyproject.toml (#162) Antonin RAFFIN 2023-03-11 22:57:45 +0100
  • 376d9551de Update MaskablePPO docs (#150) Alex Pasquali 2023-02-13 14:31:49 +0100
  • 6bc8e426bf Removed shared layers in mlp_extractor (#137) Alex Pasquali 2023-01-25 16:28:27 +0100
  • 1d0edd2dab Move pytype to pyproject.toml Antonin Raffin 2023-01-10 22:55:12 +0100
  • 7bf9cf3f3a Release v1.7.0 (#134) Antonin RAFFIN 2023-01-10 22:35:18 +0100
  • b5aa9a47ce Deprecation of shared layers in `mlp_extractor` (#133) Alex Pasquali 2023-01-05 10:42:22 +0100
  • 7c4a249fa4 Standardize the use of ``from gym import spaces`` (#131) Quentin Gallouédec 2023-01-02 15:35:00 +0100
  • c9bd045d5c Add support for python3.10 (#129) Quentin Gallouédec 2022-12-23 00:54:35 +0100
  • 9cf8b5076f Construct tensors directly on GPUs (#128) Quentin Gallouédec 2022-12-23 00:44:25 +0100
  • ab8684f469 [Feature] Non-shared features extractor in on-policy algorithms (#130) Alex Pasquali 2022-12-23 00:23:45 +0100
  • 6b23c6cfe3 Add `with_bias` parameter to `ARSPolicy` and fix `sb3_contrib/ars/policies.py` type hint (#122) Quentin Gallouédec 2022-12-12 13:22:09 +0100
  • 9a728513da Upgrade CI/github-actions (#125) Quentin Gallouédec 2022-12-09 12:30:22 +0100
  • ddb3a1355e Expose modules in `__init__.py` with `__all__` attribute (#124) Zikang Xiong 2022-12-05 09:53:57 -0500
  • b3e4ddd09a Fix `sb3_contrib/common/recurrent/type_aliases.py` type hint (#121) Quentin Gallouédec 2022-11-29 10:41:07 +0100
  • ded9f65bfd Fix `sb3_contrib/common/utils.py` type hint (#120) Quentin Gallouédec 2022-11-29 10:24:44 +0100
  • 3d28d1e5de Mypy type checking (#119) Quentin Gallouédec 2022-11-28 23:00:31 +0100
  • 703fd2dd68 Fix for new flake8 version Antonin Raffin 2022-11-25 18:51:34 +0100
  • 36aeae18b5 Fix `Self` return type (#116) Quentin Gallouédec 2022-11-22 13:12:35 +0100
  • a9735b9f31 Fix reshape LSTM states (#112) Antonin RAFFIN 2022-10-26 18:03:45 +0200
  • c75ad7dd58 Remove deprecated features (#108) Antonin RAFFIN 2022-10-11 13:04:18 +0200
  • 52795a307e Add progress bar argument (#107) Antonin RAFFIN 2022-10-10 18:44:13 +0200
  • e9c97948c8 Fixed the return type of ``.load()`` methods (#106) Quentin Gallouédec 2022-10-10 17:21:38 +0200
  • dec7b5303a Deprecate ``create_eval_env``, ``eval_env`` and ``eval_freq`` parameter (#105) Quentin Gallouédec 2022-10-10 17:12:40 +0200
  • 2490468b11 Release v1.6.1 (#104) Antonin RAFFIN 2022-09-29 12:30:12 +0200
  • cad9034fdb Handle batch norm in target update (#99) Honglu Fan 2022-08-27 04:31:00 -0600
  • 7993b75781 Support `device="auto"`for buffers and set it as default value (#98) Quentin Gallouédec 2022-08-24 09:48:18 +0200
  • 049f5a16e9 Fixed missing verbose parameter passing (#97) Burak Demirbilek 2022-08-16 16:54:46 +0300
  • eb48fec638 Maskable eval callback call callback fix (#93) CppMaster 2022-07-27 19:52:07 +0200
  • fc68af8841 Fixed shared_lstm argument in CNN and MultiInput Policies for RecurrentPPO (#90) Max Lodel 2022-07-26 00:27:17 +0200
  • 7e687ac47c Use higher resolution time_ns() and avoid division by zero (#91) Adam Gleave 2022-07-25 14:12:20 -0700
  • 3cbd2429be Fix returned type in predict (#88) Quentin Gallouédec 2022-07-18 11:49:03 +0200
  • c9d621b816 Use ICRL url for PPO blog post Antonin Raffin 2022-07-12 23:49:26 +0200
  • 5ec9e01b44 Update changelog Antonin Raffin 2022-07-12 23:15:14 +0200
  • 087951d34b Release v1.6.0 and bug fix for TRPO (#84) Antonin RAFFIN 2022-07-12 23:12:24 +0200
  • db4c0114d0 Update default TQC net arch when using NatureCnn (#79) Antonin RAFFIN 2022-06-18 10:53:29 +0200
  • bfa86ce4fe Fix masked quantities in RecurrentPPO (#78) rnederstigt 2022-06-13 16:00:40 +0200
  • 75b2de1399 Recurrent PPO (#53) Antonin RAFFIN 2022-05-29 22:31:12 -0400
  • cd592a111f Upgrade min SB3 version (#70) Antonin RAFFIN 2022-05-29 15:54:23 -0400
  • bec00386d1 Upgrade to python 3.7+ syntax (#69) Antonin RAFFIN 2022-04-25 13:02:07 +0200
  • 812648e6cd Rename QRDQN logger key (#67) Antonin RAFFIN 2022-04-12 12:50:35 +0200
  • 99853265a9 Using policy_aliases instead of register_policy (#66) Grégoire Passault 2022-04-08 15:36:23 -0400
  • 9d7e33d213 Release v1.5.0 (#64) Antonin RAFFIN 2022-03-25 15:04:53 +0100
  • f5c1aaa194 Allow PPO to turn off advantage normalization (#61) Costa Huang 2022-02-23 04:11:16 -0500
  • 901a648507 Upgrade Gym to 0.21 (#59) Adam Gleave 2022-02-22 15:25:43 +0000
  • a78891bd00 Update release date Antonin Raffin 2022-01-19 13:52:30 +0100
  • 89f2bae9f6 Release 1.4.0 (#57) Antonin RAFFIN 2022-01-19 13:50:56 +0100
  • 675304d8fa Augmented Random Search (ARS) (#42) Sean Gillen 2022-01-18 04:57:27 -0800
  • 3b007ae93b Fix TRPO doc Antonin Raffin 2021-12-29 15:03:51 +0100
  • 59be198da0 Add Trust Region Policy Optimization (TRPO) (#40) Cyprien 2021-12-29 10:58:03 +0000
  • b44689b0ea Update Maskable PPO to match SB3 PPO + improve coverage (#56) Antonin RAFFIN 2021-12-10 12:48:19 +0100
  • 20b5351086 Add color in the tests Antonin Raffin 2021-12-10 12:38:40 +0100
  • 833669a88b Drop python 3.6 support (#55) Antonin RAFFIN 2021-12-06 12:59:53 +0100
  • a1b5ea67ae Multiprocessing support for off policy algorithms (#50) Antonin RAFFIN 2021-12-02 10:40:21 +0100
  • cd0a5e516f Update citation (#54) Antonin RAFFIN 2021-12-01 19:09:32 +0100
  • b1397bbb72 Release 1.3.0 (#48) Antonin RAFFIN 2021-10-23 17:21:22 +0200
  • d6c5cea644 MaskablePPO dictionary observation support (#47) Geoff McDonald 2021-10-23 08:05:37 -0700
  • 91f9b1ed34 Remove sde net arch (#44) Antonin RAFFIN 2021-09-28 21:59:59 +0200
  • c525c5107b Upgrade min sphinx version Antonin Raffin 2021-09-23 15:26:37 +0200
  • ab24f8039f PPO variant with invalid action masking (#25) kronion 2021-09-23 07:50:10 -0500
  • b2e7126840 Train/Eval Mode Support (#39) Scott Brownlie 2021-09-08 11:54:50 +0100
  • 36eca8ee79 Fix type annotation + add python 3.9 + citation (#37) Antonin RAFFIN 2021-07-29 18:14:03 +0200