Optimality in Distributed Control from Convex Programming to Reinforcement Learning

Open access
Author:
Date: 2020-09
Type: Doctoral Thesis
ETH Bibliography: yes
Abstract
Recent years have witnessed a steadily increasing deployment of large-scale systems and infrastructures across several cyber-physical domains, ranging from the continent-wide electricity grid to eCommerce platforms. In the coming years, large-scale systems are expected to impact further aspects of our everyday lives: to name a few examples, from mobility on demand and autonomous guidance to the study of biochemical reaction networks in medicine. One crucial question remains mostly unexplored: how do we optimally and safely operate these systems when their global behaviour depends on the coordination of multiple agents, none of which can observe the overall system?
This thesis focuses on the mathematical aspects of this question. Specifically, our goal is to develop efficient algorithms for synthesizing control policies that are optimal and safe. These policies must be based exclusively on the measurements that are locally available to each agent; we refer to such control laws as optimal distributed controllers. Motivated by the lack of sufficiently accurate dynamical models for such complex systems, we additionally investigate how these algorithms can operate without knowledge of the model. In this sense, we require that they rely solely on observed data, or, as commonly stated in the machine-learning field, that they are data-driven.
In the first part of this thesis, we refine our understanding of the challenges inherent in using convex programming to synthesize globally optimal distributed controllers under a lack of full information. Our main contribution is to reinterpret the classical result of quadratic invariance (QI) so as to identify more complex decentralization schemes that achieve global optimality. We additionally incorporate robust safety constraints. Finally, we suggest a new mathematical instrument that enables an input-output parametrization of controllers.
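For readers unfamiliar with QI, the following is a minimal sketch of the standard test from the QI literature (Rotkowitz and Lall), not of this thesis's reinterpretation: for sparsity-constrained controllers, a binary controller pattern K is quadratically invariant with respect to a plant pattern G exactly when the support of the boolean product K·G·K is contained in the support of K. The specific patterns below are illustrative choices, not examples from the thesis.

```python
import numpy as np

# Standard QI test for sparsity constraints: the constraint set defined by
# the binary pattern K is QI w.r.t. the plant pattern G iff
# support(K @ G @ K) is contained in support(K).
def is_qi(K_pattern, G_pattern):
    KGK = (K_pattern @ G_pattern @ K_pattern) > 0
    return bool(np.all(KGK <= (K_pattern > 0)))

# Lower-triangular controller with a lower-triangular plant: QI holds,
# since products of lower-triangular patterns stay lower-triangular.
K = np.tril(np.ones((3, 3), dtype=int))
G = np.tril(np.ones((3, 3), dtype=int))
print(is_qi(K, G))    # True

# Fully decentralized (diagonal) controller with a fully coupled plant:
# K @ G @ K fills in off-diagonal entries, so QI fails.
K2 = np.eye(3, dtype=int)
G2 = np.ones((3, 3), dtype=int)
print(is_qi(K2, G2))  # False
```

When the test passes, the set of achievable closed-loop maps under the information constraint is convex, which is what makes globally optimal synthesis tractable in the QI case.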
In the second part of this thesis, we turn our attention to obtaining convex approximations in the presence of arbitrary information structures. Our key contribution is a novel method, termed sparsity invariance (SI), for the synthesis of tight convex restrictions. The main advantages of SI are that i) it recovers both previous global optimality results under QI and previous heuristics as special cases, ii) it outperforms previous methods, as we illustrate through specific examples, and iii) it is widely adaptable, as it can be naturally embedded within significantly different control frameworks. We study the applicability of SI and highlight its benefits in a real-world platooning scenario.
In the third part of this thesis, we investigate the gradient-descent landscape in order to assess the efficacy of reinforcement learning algorithms when the dynamical model is unknown. Our milestone is to establish sample-complexity bounds for synthesizing globally optimal distributed controllers when QI holds. We further show that gradient descent ensures global optimality for a class of problems strictly larger than QI problems. To the best of our knowledge, our work is the first to address global optimality for distributed control in a data-driven setup.
Permanent link: https://doi.org/10.3929/ethz-b-000479163
Publication status: published
External links: Search print copy at ETH Library
Contributors
Examiner: Kamgarpour, Maryam
Examiner: Papachristodoulou, Antonis
Examiner: Ferrari-Trecate, Giancarlo
Publisher: ETH Zurich
Organisational unit: 09578 - Kamgarpour, Maryam (former)