Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
METADATA ONLY
Loading...
Author / Producer
Date
2024
Publication Type
Conference Paper
ETH Bibliography
yes
Citations
Altmetric
METADATA ONLY
Data
Rights / License
Abstract
We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy. We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity. Notably, P-MBED measures the complexity of the single-agent model class converted from the given mean-field model class, and potentially, can be exponentially lower than the MBED proposed by Huang et al. (2024). We contribute a model elimination algorithm featuring a novel exploration strategy and establish sample complexity results polynomial w.r.t. P-MBED. Crucially, our results reveal that, under the basic realizability and Lipschitz continuity assumptions, learning Nash Equilibrium in MFGs is no more statistically challenging than solving a logarithmic number of single-agent RL problems. We further extend our results to Multi-Type MFGs, generalizing from conventional MFGs and involving multiple types of agents. This extension implies statistical tractability of a broader class of Markov Games through the efficacy of mean-field approximation. Finally, inspired by our theoretical algorithm, we present a heuristic approach with improved computational efficiency and empirically demonstrate its effectiveness.
Permanent link
Publication status
published
Book title
Proceedings of the 41st International Conference on Machine Learning
Journal / series
Volume
235
Pages / Article No.
19816 - 19870
Publisher
PMLR
Event
41st International Conference on Machine Learning (ICML 2024)
Edition / version
Methods
Software
Geographic location
Date collected
Date created
Subject
Organisational unit
09729 - He, Niao / He, Niao
Notes
Funding
207343 - RING: Robust Intelligence with Nonconvex Games (SNF)
180545 - NCCR Automation (phase I) (SNF)
815943 - Reliable Data-Driven Decision Making in Cyber-Physical Systems (EC)
180545 - NCCR Automation (phase I) (SNF)
815943 - Reliable Data-Driven Decision Making in Cyber-Physical Systems (EC)
Related publications and datasets
Is new version of: 10.48550/ARXIV.2402.05724Is new version of: https://openreview.net/forum?id=4ye2I5OelI