RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning


METADATA ONLY

Date

2025

Publication Type

Conference Paper

ETH Bibliography

yes

Abstract

Future robotic systems operating in real-world environments require on-board embodied intelligence without a continuous cloud connection, balancing capabilities against constraints on computational power and memory. This work presents an extension of the R1-Zero approach that enables the use of Large Language Models (LLMs) with small parameter counts in the robotic domain. The R1-Zero approach was originally developed to enable mathematical reasoning in LLMs using static datasets. We extend it to the robotics domain through integration with a closed-loop Reinforcement Learning (RL) framework. This extension allows reasoning in Embodied Artificial Intelligence (EmbodiedAI) settings without relying solely on distillation of large models through Supervised Fine-Tuning (SFT). We show that small-scale LLMs can achieve effective reasoning performance by learning through closed-loop interaction with their environment, which enables tasks that previously required significantly larger models. A performance gain of 20.2 percentage points over the SFT-based baseline is observed with a Qwen2.5-1.5B model. Using the proposed training procedure, Qwen2.5-3B achieves a 63.3% control adaptability score, surpassing the 58.5% obtained by the much larger, cloud-bound GPT-4o. These results highlight that practical, on-board deployment of small LLMs is not only feasible but can outperform larger models when trained through environmental interaction, underscoring the importance of an interactive, embodied learning framework for robotic EmbodiedAI that is grounded in practical experience rather than static supervision.

Publication status

accepted

Book title

9th Annual Conference on Robot Learning (CoRL 2025)

Event

9th Conference on Robot Learning (CoRL 2025)

Subject

Large language models; Reinforcement learning; Embodied AI; Constrained hardware

Organisational unit

03996 - Benini, Luca / Benini, Luca
01225 - D-ITET Zentr. f. projektbasiertes Lernen / D-ITET Center for Project-Based Learning
