Show simple item record

dc.contributor.author
Montoya, Juan M.
dc.contributor.author
Daunhawer, Imant
dc.contributor.author
Vogt, Julia E.
dc.contributor.author
Wiering, Marco
dc.contributor.editor
Rocha, Ana P.
dc.contributor.editor
Steels, Luc
dc.contributor.editor
van den Herik, Jaap
dc.date.accessioned
2021-07-26T13:42:16Z
dc.date.available
2021-07-04T03:08:08Z
dc.date.available
2021-07-26T12:20:55Z
dc.date.available
2021-07-26T13:42:16Z
dc.date.issued
2021
dc.identifier.isbn
978-989-758-484-8
en_US
dc.identifier.other
10.5220/0010237507520759
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/492877
dc.description.abstract
In the quest for efficient and robust learning methods, combining unsupervised state representation learning and reinforcement learning (RL) could offer advantages for scaling RL algorithms by providing the models with a useful inductive bias. To achieve this, an encoder is trained in an unsupervised manner with two state representation methods, a variational autoencoder and a contrastive estimator. The learned features are then fed to the actor-critic RL algorithm Proximal Policy Optimization (PPO) to learn a policy for playing OpenAI's car racing environment. This procedure thus decouples state representations from the RL controller. For the integration of RL with unsupervised learning, we explore various designs for variational autoencoders and contrastive learning. The proposed method is compared to a deep network trained directly on pixel inputs with PPO. The results show that the proposed method performs slightly worse than learning directly from pixel inputs; however, it has a more stable learning curve, requires a substantially smaller buffer, and optimizes 88% fewer parameters. These results indicate that the use of pre-trained state representations has several benefits for solving RL tasks. © 2021 by SCITEPRESS – Science and Technology Publications, Lda.
en_US
dc.language.iso
en
en_US
dc.publisher
SciTePress
en_US
dc.subject
Deep Reinforcement Learning
en_US
dc.subject
State Representation Learning
en_US
dc.subject
Variational Autoencoders
en_US
dc.subject
Contrastive Learning
en_US
dc.title
Decoupling State Representation Methods from Reinforcement Learning in Car Racing
en_US
dc.type
Conference Paper
ethz.book.title
Proceedings of the 13th International Conference on Agents and Artificial Intelligence
en_US
ethz.journal.volume
2
en_US
ethz.pages.start
752
en_US
ethz.pages.end
759
en_US
ethz.event
13th International Conference on Agents and Artificial Intelligence (ICAART)
en_US
ethz.event.location
online
en_US
ethz.event.date
February 4-6, 2021
en_US
ethz.identifier.wos
ethz.publication.place
Setubal
en_US
ethz.publication.status
published
en_US
ethz.date.deposited
2021-07-04T03:08:23Z
ethz.source
WOS
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2021-07-26T13:42:23Z
ethz.rosetta.lastUpdated
2021-07-26T13:42:23Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Decoupling%20State%20Representation%20Methods%20from%20Reinforcement%20Learning%20in%20Car%20Racing&rft.date=2021&rft.volume=2&rft.spage=752&rft.epage=759&rft.au=Montoya,%20Juan%20M.&Daunhawer,%20Imant&Vogt,%20Julia%20E.&Wiering,%20Marco&rft.isbn=978-989-758-484-8&rft.genre=proceeding&rft_id=info:doi/10.5220/0010237507520759&rft.btitle=Proceedings%20of%20the%2013th%20International%20Conference%20on%20Agents%20and%20Artificial%20Intelligence
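
The abstract describes a decoupled setup in which an encoder is pre-trained without rewards (with a variational autoencoder or a contrastive estimator) and its fixed features are then passed to a PPO actor-critic. The following minimal PyTorch sketch illustrates that idea only; it is not the authors' implementation, and the layer sizes, latent dimension, and discrete action count are assumptions made for the example (CarRacing observations are 96x96 RGB frames).

# Illustrative sketch (not the authors' code): a frozen, pre-trained encoder
# supplies state features to a small actor-critic head, decoupling state
# representation learning from the PPO controller. Layer sizes, latent_dim,
# and n_actions are assumptions for this example.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Stand-in for a VAE or contrastive encoder trained without rewards."""
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),    # 96 -> 47
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),   # 47 -> 22
            nn.Conv2d(64, 128, 4, stride=2), nn.ReLU(),  # 22 -> 10
            nn.Flatten(),
        )
        self.fc = nn.Linear(128 * 10 * 10, latent_dim)

    def forward(self, x):
        return self.fc(self.conv(x))

class ActorCriticHead(nn.Module):
    """Small policy and value heads trained with PPO on frozen features."""
    def __init__(self, latent_dim: int = 32, n_actions: int = 5):
        super().__init__()
        self.policy = nn.Sequential(nn.Linear(latent_dim, 64), nn.Tanh(),
                                    nn.Linear(64, n_actions))
        self.value = nn.Sequential(nn.Linear(latent_dim, 64), nn.Tanh(),
                                   nn.Linear(64, 1))

    def forward(self, z):
        return self.policy(z), self.value(z)

encoder = Encoder()
encoder.eval()                    # pre-trained weights would be loaded here
for p in encoder.parameters():
    p.requires_grad_(False)       # representation stays fixed during RL

head = ActorCriticHead()
obs = torch.rand(1, 3, 96, 96)    # dummy CarRacing-style frame
with torch.no_grad():
    z = encoder(obs)              # state features from the frozen encoder
logits, value = head(z)           # only the head is optimized by PPO

Because the encoder is frozen, only the small head is updated by PPO, which mirrors the reduction in optimized parameters reported in the abstract.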

Files in this item


There are no files associated with this item.
