Python Contrib Package for Experimental Reinforcement Learning

Reverse engineered prompt

Build me a Python contrib package for Stable Baselines3 that gives researchers and RL hobbyists easy access to experimental reinforcement learning methods.

I want it to feel like Stable Baselines3: simple to install, simple to import, and easy to use with normal Gym-style environments. Include working implementations of ARS, QR-DQN, Maskable PPO (for invalid action masking), Recurrent PPO (with LSTM), TQC, TRPO, and CrossQ, plus a time feature wrapper. Each method should support the usual train, predict, save, and load workflow, with clear examples so someone can copy a few lines and start training.

Please add tests, basic documentation, and small example scripts that show the algorithms running on simple environments. Keep the code clean and practical; it’s fine for the algorithms themselves to be experimental. If you need exact compatibility details, look up the current Stable Baselines3 docs online.

Stable-Baselines-Team/stable-baselines3-contrib — reverse-engineered prompt
