Denis Tarasov


Last Name

Tarasov

First Name

Denis

Organisational unit

09689 - Katzschmann, Robert / Katzschmann, Robert

Search Results

Publications 1 - 1 of 1
  • Tarasov, Denis (2025)
    In this work, we explore the integration of Reinforcement Learning (RL) approaches within a scalable offline In-Context RL (ICRL) framework. None of the existing offline ICRL approaches optimizes an RL objective, even though doing so is expected to benefit the resulting agent's performance. Through experiments across more than 150 datasets derived from GridWorld-based and MuJoCo environments, we demonstrate that optimizing RL objectives improves performance by approximately 30% on average compared to the widely established Transformer-based Algorithm Distillation (AD) baseline across various dataset coverages, structures, expertise levels, and environmental complexities. Moreover, RL-based approaches perform twice as well as AD when tested on the challenging XLand-MiniGrid environment with a tiny fraction of the available dataset. Our results also reveal that offline RL-based methods outperform online RL approaches in this setup, a non-trivial finding given the need to adapt to out-of-distribution tasks. These findings underscore the importance of aligning the learning objectives with RL's reward-maximization goal and demonstrate that offline RL is a promising direction for ICRL settings. Our findings provide sufficient evidence that future offline ICRL methods should explicitly optimize RL objectives.