stable-baselines3-contrib-sacd

Commit Graph

Author	SHA1	Message	Date
Antonin RAFFIN	a1b5ea67ae	Multiprocessing support for off policy algorithms (#50) * TQC support for multienv * Add optional layer norm for TQC * Add layer nprm for all policies * Revert "Add layer nprm for all policies" This reverts commit 1306c3c64eb12613464982c66cb416a3bbc66285. * Revert "Add optional layer norm for TQC" This reverts commit 200222e3a8878007aa6032d540ae74274a4d0788. * Add experimental support to train off-policy algorithms with multiple envs * Bump version * Update version	2021-12-02 10:40:21 +01:00
Antonin RAFFIN	91f9b1ed34	Remove sde net arch (#44)	2021-09-28 21:59:59 +02:00
Toshiki Watanabe	b30397fff5	Add QR-DQN (#13) * Add QR-DQN(WIP) * Update docstring * Add quantile_huber_loss * Fix typo * Remove unnecessary lines * Update variable names and comments in quantile_huber_loss * Fix mutable arguments * Update variable names * Ignore import not used warnings * Fix default parameter of optimizer in QR-DQN * Update quantile_huber_loss to have more reasonable interface * update tests * Add assertion to quantile_huber_loss * Update variable names of quantile regression * Update comments * Reduce the number of quantiles during test * Update comment * Update quantile_huber_loss * Fix isort * Add document of QR-DQN without results * Update docs * Fix bugs * Update doc * Add comments about shape * Minor edits * Update comments * Add benchmark * Doc fixes * Update doc * Bug fix in saving/loading + update tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2020-12-21 11:17:48 +01:00
Antonin RAFFIN	72fe9a2072	Faster tests	2020-10-17 17:06:11 +02:00
Antonin RAFFIN	afe7b132e4	Lint	2020-10-12 20:25:11 +02:00
Antonin RAFFIN	5d7b79d41a	Improve coverage	2020-10-12 20:17:33 +02:00
Antonin RAFFIN	7609c87e84	Cleanup TQC	2020-10-12 19:50:08 +02:00
Antonin RAFFIN	0d9f2e229e	Add TQC and base scripts	2020-09-25 12:47:45 +02:00

8 Commits