Show simple item record

dc.contributor.author
Beuchat, Paul N.
dc.contributor.supervisor
Lygeros, John
dc.contributor.supervisor
Rantzer, Anders
dc.contributor.supervisor
Georghiou, Angelos
dc.date.accessioned
2021-03-01T08:01:39Z
dc.date.available
2020-02-21T15:38:12Z
dc.date.available
2020-02-27T15:11:11Z
dc.date.available
2020-02-28T07:17:29Z
dc.date.available
2021-03-01T08:01:39Z
dc.date.issued
2019-10
dc.identifier.isbn
978-3-906916-88-0
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/401151
dc.identifier.doi
10.3929/ethz-b-000401151
dc.description.abstract
Many problems in science and engineering can be cast as discrete time, continuous space, infinite horizon stochastic optimal control problems. The so-called value function and Q-function both characterise the solution of such problems, but are both intractable to compute in all but a few special cases. This thesis focuses on methods that approximate the value function and Q-function. We consider the linear programming approach to approximate dynamic programming, which computes approximate value functions and Q-functions that are point-wise under-estimators of the optimal by using the so-called Bellman inequality. For this approximation method we provide theoretical guarantees on the value function and Q-function approximation error, and also for the sub-optimality of a policy generated using such lower-bounding approximations. In particular, the online performance guarantee is obtained by analysing an iterated version of the greedy policy, and the fitting error guarantee by analysing an iterated version of the Bellman inequality. These guarantees complement the existing bounds that appear in the literature. Given a collection of lower-bounding approximate value functions, an improved approximation can be constructed by taking the point-wise maximum over the collection, however, the challenge is how to compute the collection itself. To address this challenge, we introduce a novel formulation, referred to as the point-wise maximum approach to approximate dynamic programming, and use this to propose algorithms that iteratively construct a collection of lower-bounding value functions with the objective of maximising the point-wise maximum of the collection. We empirically demonstrate the advantages of the proposed algorithm through a range numerical examples that indicate classes of problems where the proposed algorithms improves upon state-of-the-art methods. A key result from the numerical studies is that the proposed algorithm can provide practically useful sub-optimality bounds for the online performance of any policy, even when the collection of approximate value functions is itself impractical to use for a policy.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
ETH Zurich
en_US
dc.rights.uri
http://rightsstatements.org/page/InC-NC/1.0/
dc.subject
approximate dynamic programming
en_US
dc.subject
stochastic optimal control
en_US
dc.subject
dynamic programming
en_US
dc.subject
infinite dimensional linear programming
en_US
dc.subject
iterative methods
en_US
dc.title
Approximate Dynamic Programming: theoretical guarantees and practical algorithms for a continuous space setting
en_US
dc.type
Doctoral Thesis
dc.rights.license
In Copyright - Non-Commercial Use Permitted
dc.date.published
2020-02-28
ethz.size
202 p.
en_US
ethz.code.ddc
DDC - DDC::6 - Technology, medicine and applied sciences::620 - Engineering & allied operations
en_US
ethz.identifier.diss
26165
en_US
ethz.publication.place
Zurich
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02650 - Institut für Automatik / Automatic Control Laboratory::03751 - Lygeros, John / Lygeros, John
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02650 - Institut für Automatik / Automatic Control Laboratory::03751 - Lygeros, John / Lygeros, John
en_US
ethz.date.deposited
2020-02-21T15:38:22Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.date.embargoend
2021-02-28
ethz.rosetta.installDate
2020-02-28T07:17:39Z
ethz.rosetta.lastUpdated
2022-03-29T05:30:08Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Approximate%20Dynamic%20Programming:%20theoretical%20guarantees%20and%20practical%20algorithms%20for%20a%20continuous%20space%20setting&rft.date=2019-10&rft.au=Beuchat,%20Paul%20N.&rft.isbn=978-3-906916-88-0&rft.genre=unknown&rft.btitle=Approximate%20Dynamic%20Programming:%20theoretical%20guarantees%20and%20practical%20algorithms%20for%20a%20continuous%20space%20setting
 Search print copy at ETH Library

Files in this item

Thumbnail
Thumbnail

Publication type

Show simple item record