Metadata only
Date
2024-07
Type
Conference Paper
ETH Bibliography
yes
Abstract
In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework, where the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task to reduce the exploration risk of the latter while solving the two tasks in parallel. Unlike previous work, we do not assume the low-tier and high-tier tasks share the same dynamics or reward functions, and focus on robust knowledge transfer without prior knowledge of the task similarity. We identify a natural and necessary condition called the "Optimal Value Dominance" for our objective. Under this condition, we propose novel online learning algorithms such that, for the high-tier task, they achieve constant regret on partial states depending on the task similarity and retain near-optimal regret when the two tasks are dissimilar, while for the low-tier task, they remain near-optimal without sacrifice. We further study the setting with multiple low-tier tasks, and propose a novel transfer source selection mechanism, which can ensemble the information from all low-tier tasks and allow provable benefits on a much larger state-action space.
Publication status
published
External links
Editor
Book title
Advances in Neural Information Processing Systems 36
Pages / Article number
Publisher
Curran
Conference
Organisational unit
09729 - He, Niao / He, Niao
Funding
207343 - RING: Robust Intelligence with Nonconvex Games (SNF)
Related publications and data
Is new version of: https://doi.org/10.48550/arXiv.2302.05534
Is new version of: https://openreview.net/forum?id=1WMdoiVMov
Notes
Poster presentation on December 12, 2023.