Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL


METADATA ONLY
Loading...

Date

2024

Publication Type

Conference Paper

ETH Bibliography

yes

Citations

Altmetric
METADATA ONLY

Data

Rights / License

Abstract

We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy. We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity. Notably, P-MBED measures the complexity of the single-agent model class converted from the given mean-field model class, and potentially, can be exponentially lower than the MBED proposed by Huang et al. (2024). We contribute a model elimination algorithm featuring a novel exploration strategy and establish sample complexity results polynomial w.r.t. P-MBED. Crucially, our results reveal that, under the basic realizability and Lipschitz continuity assumptions, learning Nash Equilibrium in MFGs is no more statistically challenging than solving a logarithmic number of single-agent RL problems. We further extend our results to Multi-Type MFGs, generalizing from conventional MFGs and involving multiple types of agents. This extension implies statistical tractability of a broader class of Markov Games through the efficacy of mean-field approximation. Finally, inspired by our theoretical algorithm, we present a heuristic approach with improved computational efficiency and empirically demonstrate its effectiveness.

Publication status

published

Book title

Proceedings of the 41st International Conference on Machine Learning

Volume

235

Pages / Article No.

19816 - 19870

Publisher

PMLR

Event

41st International Conference on Machine Learning (ICML 2024)

Edition / version

Methods

Software

Geographic location

Date collected

Date created

Subject

Organisational unit

09729 - He, Niao / He, Niao check_circle

Notes

Funding

207343 - RING: Robust Intelligence with Nonconvex Games (SNF)
180545 - NCCR Automation (phase I) (SNF)
815943 - Reliable Data-Driven Decision Making in Cyber-Physical Systems (EC)

Related publications and datasets