Description
I would like to contribute Beyond the Rainbow (BTR) to Stable-Baselines3. BTR improves over Rainbow Deep Q-Network (DQN) with six further improvements from across the RL literature and is designed with computational efficiency in mind, so it can be trained on high-end desktop PCs.
Paper: https://arxiv.org/abs/2411.03820
Code: https://github.com/VIPTankz/BTR
Background
Beyond the Rainbow (BTR) is an image-based RL algorithm for discrete action spaces that improves over Rainbow DQN by adding six further improvements: an IMPALA architecture (scale = 2), adaptive maxpooling (6x6), spectral normalization, Implicit Quantile Networks (IQN), Munchausen RL and vectorized environments. The algorithm has started to gain traction (https://scholar.google.com/scholar?cites=3310089883274021659).
BTR is competitive with recent algorithms such as DreamerV3 (Hafner et al., 2023) and MEME (Kapturowski et al., 2023), which is notable given its focus on training in more resource-restricted settings such as desktop PCs. The algorithm was benchmarked on a high-end desktop PC, achieving a human-normalized interquartile mean (IQM) of 7.4 on Atari-60 within 12 hours.
The implementation is based on PyTorch.
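For concreteness, here is a minimal PyTorch sketch of how the network-side components could fit together. Everything below (class names, channel widths, head sizes) is illustrative rather than taken from the BTR codebase: an IMPALA-style trunk at scale 2 with spectral normalization on the residual convolutions, adaptive maxpooling to a fixed 6x6 output, and an IQN-style quantile head.

```python
# Illustrative sketch only, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils.parametrizations import spectral_norm


class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Spectral normalization bounds the Lipschitz constant of each
        # convolution, which BTR uses to stabilize training.
        self.conv1 = spectral_norm(nn.Conv2d(channels, channels, 3, padding=1))
        self.conv2 = spectral_norm(nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x):
        out = self.conv1(F.relu(x))
        out = self.conv2(F.relu(out))
        return x + out


class ImpalaBlock(nn.Module):
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, 3, padding=1)
        self.res1 = ResidualBlock(out_channels)
        self.res2 = ResidualBlock(out_channels)

    def forward(self, x):
        # Conv -> maxpool (stride 2) -> two residual blocks, as in IMPALA.
        x = F.max_pool2d(self.conv(x), kernel_size=3, stride=2, padding=1)
        return self.res2(self.res1(x))


class BTRNetwork(nn.Module):
    """Illustrative IMPALA (scale=2) trunk + 6x6 adaptive maxpool + IQN head."""

    def __init__(self, n_actions: int, frames: int = 4, embedding_dim: int = 64):
        super().__init__()
        # IMPALA's original channel widths (16, 32, 32), doubled for scale 2.
        self.trunk = nn.Sequential(
            ImpalaBlock(frames, 32),
            ImpalaBlock(32, 64),
            ImpalaBlock(64, 64),
            nn.ReLU(),
            nn.AdaptiveMaxPool2d((6, 6)),  # fixed 6x6 spatial output
            nn.Flatten(),
        )
        feat_dim = 64 * 6 * 6
        self.embedding_dim = embedding_dim
        # IQN: embed sampled quantile fractions tau with a cosine basis.
        self.tau_embed = nn.Linear(embedding_dim, feat_dim)
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, n_actions)
        )

    def forward(self, obs, n_taus: int = 8):
        batch = obs.shape[0]
        feats = self.trunk(obs)  # (batch, feat_dim)
        taus = torch.rand(batch, n_taus, 1, device=obs.device)
        # Cosine embedding of tau, following the IQN paper.
        i = torch.arange(1, self.embedding_dim + 1, device=obs.device).float()
        cos = torch.cos(taus * i * torch.pi)          # (batch, n_taus, emb)
        tau_feats = F.relu(self.tau_embed(cos))       # (batch, n_taus, feat_dim)
        # Hadamard product merges state and quantile embeddings.
        merged = feats.unsqueeze(1) * tau_feats
        return self.head(merged), taus  # quantile values: (batch, n_taus, n_actions)
```

Munchausen RL and vectorized environments affect the loss and the data-collection loop rather than the network, so they are not shown in this sketch.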
Benefits
- Provide a state-of-the-art algorithm that can be trained on a high-end desktop, which is of interest to smaller research labs and hobbyists who don't have access to the hardware required by more resource-intensive algorithms.
- BTR can handle complex 3D games and has been used to train agents for Super Mario Galaxy, Mario Kart and Mortal Kombat (https://www.youtube.com/playlist?list=PL4geUsKi0NN-sjbuZP_fU28AmAPQunLoI), gaining interest from the community building agents for games.
Practical Details
I will be working with the original author, Tyler Clark, to ensure that the SB3 implementation achieves the performance reported in the paper.
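To illustrate the intended integration, here is a hypothetical usage sketch in the familiar SB3 style. The `BTR` class does not exist yet; its name and arguments are assumptions for discussion. The environment helpers shown are SB3's existing utilities.

```python
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

# BTR relies on vectorized environments, so several are created in parallel.
vec_env = VecFrameStack(make_atari_env("BreakoutNoFrameskip-v4", n_envs=8), n_stack=4)

# Hypothetical API, mirroring other SB3 algorithms:
# model = BTR("CnnPolicy", vec_env, verbose=1)
# model.learn(total_timesteps=10_000_000)
```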