Description
I would like to contribute Beyond the Rainbow (BTR) to Stable-Baselines3. BTR improves over Rainbow Deep Q-Network (DQN) with six further improvements from across the RL literature and is designed with computational efficiency in mind, so it can be trained on high-end desktop PCs.
Paper: https://arxiv.org/abs/2411.03820
Code: https://github.com/VIPTankz/BTR
Background
Beyond the Rainbow (BTR) is an image-based RL algorithm for discrete action spaces that improves over Rainbow DQN by adding six further improvements: an IMPALA architecture (scale = 2), adaptive maxpooling (6x6), spectral normalization, Implicit Quantile Networks (IQN), Munchausen RL and vectorized environments. The algorithm has started to gain traction (https://scholar.google.com/scholar?cites=3310089883274021659).
BTR is competitive with recent algorithms such as DreamerV3 (Hafner et al., 2023) and MEME (Kapturowski et al., 2023), which is notable given its focus on training in more resource-restricted settings such as desktop PCs. The algorithm was benchmarked on a high-end desktop PC, achieving a human-normalized interquartile mean (IQM) of 7.4 on Atari-60 within 12 hours.
The implementation is based on PyTorch.
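For concreteness, here is a minimal PyTorch sketch of how the network-side components could fit together. Everything below (class names, channel widths, head sizes) is illustrative rather than taken from the BTR codebase: an IMPALA-style trunk at scale 2 with spectral normalization on the residual convolutions, adaptive maxpooling to a fixed 6x6 output, and an IQN-style quantile head.

```python
# Illustrative sketch only, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils.parametrizations import spectral_norm


class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Spectral normalization bounds the Lipschitz constant of each
        # convolution, which BTR uses to stabilize training.
        self.conv1 = spectral_norm(nn.Conv2d(channels, channels, 3, padding=1))
        self.conv2 = spectral_norm(nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x):
        out = self.conv1(F.relu(x))
        out = self.conv2(F.relu(out))
        return x + out


class ImpalaBlock(nn.Module):
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, 3, padding=1)
        self.res1 = ResidualBlock(out_channels)
        self.res2 = ResidualBlock(out_channels)

    def forward(self, x):
        # Conv -> maxpool (stride 2) -> two residual blocks, as in IMPALA.
        x = F.max_pool2d(self.conv(x), kernel_size=3, stride=2, padding=1)
        return self.res2(self.res1(x))


class BTRNetwork(nn.Module):
    """Illustrative IMPALA (scale=2) trunk + 6x6 adaptive maxpool + IQN head."""

    def __init__(self, n_actions: int, frames: int = 4, embedding_dim: int = 64):
        super().__init__()
        # IMPALA's original channel widths (16, 32, 32), doubled for scale 2.
        self.trunk = nn.Sequential(
            ImpalaBlock(frames, 32),
            ImpalaBlock(32, 64),
            ImpalaBlock(64, 64),
            nn.ReLU(),
            nn.AdaptiveMaxPool2d((6, 6)),  # fixed 6x6 spatial output
            nn.Flatten(),
        )
        feat_dim = 64 * 6 * 6
        self.embedding_dim = embedding_dim
        # IQN: embed sampled quantile fractions tau with a cosine basis.
        self.tau_embed = nn.Linear(embedding_dim, feat_dim)
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, n_actions)
        )

    def forward(self, obs, n_taus: int = 8):
        batch = obs.shape[0]
        feats = self.trunk(obs)  # (batch, feat_dim)
        taus = torch.rand(batch, n_taus, 1, device=obs.device)
        # Cosine embedding of tau, following the IQN paper.
        i = torch.arange(1, self.embedding_dim + 1, device=obs.device).float()
        cos = torch.cos(taus * i * torch.pi)          # (batch, n_taus, emb)
        tau_feats = F.relu(self.tau_embed(cos))       # (batch, n_taus, feat_dim)
        # Hadamard product merges state and quantile embeddings.
        merged = feats.unsqueeze(1) * tau_feats
        return self.head(merged), taus  # quantile values: (batch, n_taus, n_actions)
```

Munchausen RL and vectorized environments affect the loss and the data-collection loop rather than the network, so they are not shown in this sketch.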
Benefits
- Provide a state-of-the-art algorithm that can be trained on a high-end desktop, which is of interest to smaller research labs and hobbyists who don't have access to the hardware required by more resource-intensive algorithms.
- BTR can handle complex 3D games and has been used to train agents for Super Mario Galaxy, Mario Kart and Mortal Kombat (https://www.youtube.com/playlist?list=PL4geUsKi0NN-sjbuZP_fU28AmAPQunLoI), gaining interest from the community building agents for games.
Practical Details
I will be working with the original author, Tyler Clark, to ensure that the SB3 implementation achieves the performance reported in the paper.
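To illustrate the intended integration, here is a hypothetical usage sketch in the familiar SB3 style. The `BTR` class does not exist yet; its name and arguments are assumptions for discussion. The environment helpers shown are SB3's existing utilities.

```python
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

# BTR relies on vectorized environments, so several are created in parallel.
vec_env = VecFrameStack(make_atari_env("BreakoutNoFrameskip-v4", n_envs=8), n_stack=4)

# Hypothetical API, mirroring other SB3 algorithms:
# model = BTR("CnnPolicy", vec_env, verbose=1)
# model.learn(total_timesteps=10_000_000)
```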