Show simple item record

dc.contributor.author
Rothfuss, Jonas
dc.contributor.author
Sukhija, Bhavya
dc.contributor.author
Birchler, Tobias
dc.contributor.author
Kassraie, Parnian
dc.contributor.author
Krause, Andreas
dc.contributor.editor
Evans, Robin J.
dc.contributor.editor
Shpitser, Ilya
dc.date.accessioned
2024-01-31T12:21:24Z
dc.date.available
2024-01-15T22:01:02Z
dc.date.available
2024-01-31T12:21:24Z
dc.date.issued
2023
dc.identifier.issn
2640-3498
dc.identifier.uri
http://hdl.handle.net/20.500.11850/652922
dc.description.abstract
We study the problem of conservative off-policy evaluation (COPE) where, given an offline dataset of environment interactions collected by other agents, we seek to obtain a (tight) lower bound on a policy’s performance. This is crucial when deciding whether a given policy satisfies certain minimal performance/safety criteria before it can be deployed in the real world. To this end, we introduce HAMBO, which builds on an uncertainty-aware learned model of the transition dynamics. To form a conservative estimate of the policy’s performance, HAMBO hallucinates worst-case trajectories that the policy may take, within the margin of the model’s epistemic confidence regions. We prove that the resulting COPE estimates are valid lower bounds and, under regularity conditions, show their convergence to the true expected return. Finally, we discuss scalable variants of our approach based on Bayesian Neural Networks and empirically demonstrate that they yield reliable and tight lower bounds in various continuous control environments.
en_US
dc.language.iso
en
en_US
dc.publisher
PMLR
en_US
dc.title
Hallucinated Adversarial Control for Conservative Offline Policy Evaluation
en_US
dc.type
Conference Paper
ethz.book.title
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
en_US
ethz.journal.title
Proceedings of Machine Learning Research
ethz.journal.volume
216
en_US
ethz.pages.start
1774
en_US
ethz.pages.end
1784
en_US
ethz.event
39th Conference on Uncertainty in Artificial Intelligence (UAI 2023)
en_US
ethz.event.location
Pittsburgh, PA, USA
en_US
ethz.event.date
July 31 - August 4, 2023
en_US
ethz.grant
Reliable Data-Driven Decision Making in Cyber-Physical Systems
en_US
ethz.grant
NCCR Automation (phase I)
en_US
ethz.publication.place
Cambridge, MA
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::03908 - Krause, Andreas / Krause, Andreas
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::03908 - Krause, Andreas / Krause, Andreas
en_US
ethz.identifier.url
https://proceedings.mlr.press/v216/rothfuss23a.html
ethz.grant.agreementno
815943
ethz.grant.agreementno
180545
ethz.grant.fundername
EC
ethz.grant.fundername
SNF
ethz.grant.funderDoi
10.13039/501100000780
ethz.grant.funderDoi
10.13039/501100001711
ethz.grant.program
H2020
ethz.grant.program
NCCR full proposal
ethz.date.deposited
2024-01-15T22:01:03Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2024-01-31T12:21:25Z
ethz.rosetta.lastUpdated
2024-01-31T12:21:25Z
ethz.rosetta.exportRequired
true
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Hallucinated%20Adversarial%20Control%20for%20Conservative%20Offline%20Policy%20Evaluation&rft.jtitle=Proceedings%20of%20Machine%20Learning%20Research&rft.date=2023&rft.volume=216&rft.spage=1774&rft.epage=1784&rft.issn=2640-3498&rft.au=Rothfuss,%20Jonas&rft.au=Sukhija,%20Bhavya&rft.au=Birchler,%20Tobias&rft.au=Kassraie,%20Parnian&rft.au=Krause,%20Andreas&rft.genre=proceeding&rft.btitle=Proceedings%20of%20the%20Thirty-Ninth%20Conference%20on%20Uncertainty%20in%20Artificial%20Intelligence

Files in this item

There are no files associated with this item.
