Open access
Author
Date
2022
Type
Doctoral Thesis
ETH Bibliography
yes
Abstract
There is a large range of applications in which autonomous robots could take over dangerous tasks, help to reduce energy and resource use, or enable novel designs. However, many of these applications require robots to operate in our everyday environments rather than on factory floors. A crucial challenge in autonomous robotics is therefore to enable robots to navigate and interact in these environments, which requires semantic scene understanding. To this end, data-driven algorithms, especially deep learning, have greatly improved the capabilities of machines to detect and identify objects. Yet, these methods fail in the presence of unseen object types. When robots are deployed in open-world environments, they need to operate autonomously even in the presence of unknown objects.
This thesis investigates the problem of robotic scene understanding in open-world, everyday environments. The proposed systems can identify unknown parts of a scene, and even adapt and improve their perception capabilities in these environments fully autonomously.
As a first step, this thesis introduces a benchmark to measure how well robotic perception methods can identify outliers. It focuses on anomalies in semantic segmentation for urban driving, where unknown categories should be correctly segmented in images. The benchmark reveals a significant gap between existing methods and the requirements of robotic systems. As a public benchmark, it also facilitates progress on the task of anomaly detection. Pursuing this goal, the thesis proposes a method that systematically addresses different failure modes of semantic segmentation in the presence of anomalies. It detects anomalies by re-synthesizing a possible input image from the predicted semantic segmentation and comparing the re-synthesized image with the original input, while also taking predictive uncertainties into account.
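The comparison step can be illustrated with a short sketch. The snippet below is a hypothetical, simplified rendering of that idea, assuming a generator network has already re-synthesized the input from the predicted label map; the function name, the `alpha` parameter, and the fixed weighted fusion are illustrative assumptions, since the thesis method learns the comparison rather than applying a hand-crafted formula.

```python
import numpy as np

def anomaly_score(image, resynthesized, class_probs, alpha=0.5):
    """Per-pixel anomaly score fusing re-synthesis discrepancy and
    predictive uncertainty. Illustrative sketch only: the actual
    method learns this comparison instead of using a fixed formula.

    image, resynthesized: float arrays, shape (H, W, 3), values in [0, 1]
    class_probs: softmax output of the segmentation model, shape (H, W, C)
    """
    # Re-synthesis discrepancy: unknown objects receive wrong labels,
    # so the image generated from those labels differs from the input.
    discrepancy = np.abs(image - resynthesized).mean(axis=-1)  # (H, W)

    # Predictive uncertainty: softmax entropy, normalized to [0, 1].
    eps = 1e-12
    entropy = -(class_probs * np.log(class_probs + eps)).sum(axis=-1)
    entropy /= np.log(class_probs.shape[-1])

    # Weighted fusion; alpha is a hypothetical hyperparameter.
    return alpha * discrepancy + (1.0 - alpha) * entropy
```

High scores mark pixels where the segmentation cannot explain the observed image, which is exactly where unknown objects tend to appear.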
Building on the detection of unknowns, the second part of the thesis investigates how robots can learn on their own from unknown environments. The thesis proposes a combination of self-supervision and continual learning to build robotic systems that integrate knowledge gained from different environments. This is evaluated on a robot that localizes in indoor spaces. The robot successfully adapts to the different environments without any supervision, and it improves the more environments it explores. To then integrate self-supervision and anomaly detection, the thesis proposes a framework that allows robots to identify novel categories found in a deployment environment. The framework integrates clustering, inference, and mapping to train a model such that it improves on known categories and additionally learns novel ones. Towards a general solution for a large range of scenarios, the framework fuses information from different modalities and introduces a method to optimize its parameters without human input.
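To make the discovery step concrete, the following sketch shows one plausible reading of the clustering part, assuming per-segment feature embeddings collected during deployment and mean embeddings (prototypes) of the known classes. The function name, the k-means choice, and the distance threshold are assumptions for illustration; the full framework additionally fuses modalities and uses mapping to make pseudo-labels consistent across viewpoints before retraining.

```python
import numpy as np
from sklearn.cluster import KMeans

def discover_novel_categories(features, known_prototypes,
                              n_clusters=5, novelty_threshold=1.0):
    """Illustrative sketch: cluster deployment features and flag
    clusters far from every known class prototype as candidate
    novel categories. Names and threshold are hypothetical.

    features: (N, D) per-segment embeddings from the deployment site
    known_prototypes: (K, D) mean embeddings of the known classes
    """
    clustering = KMeans(n_clusters=n_clusters, n_init=10).fit(features)

    novel_ids = []
    for cid, center in enumerate(clustering.cluster_centers_):
        # Clusters close to a known prototype refine known classes;
        # distant clusters are treated as candidate novel categories.
        dists = np.linalg.norm(known_prototypes - center, axis=1)
        if dists.min() > novelty_threshold:
            novel_ids.append(cid)

    # Cluster assignments serve as pseudo-labels for continually
    # fine-tuning the model on known plus newly discovered classes.
    return clustering.labels_, novel_ids
```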
In conclusion, this thesis presents the necessary building blocks of a system that perpetually provides scene understanding, detects outliers, and improves itself. This is a fundamental shift from the dominant methodology of collecting large datasets to train fixed models, towards more dynamic and adaptive robotic perception systems.
Permanent link
https://doi.org/10.3929/ethz-b-000585663
Publication status
published
External links
Search print copy at ETH Library
Publisher
ETH Zurich
Subject
semantic segmentation; robotics; robotic perception; anomaly detection; continual learning
Organisational unit
03737 - Siegwart, Roland Y. / Siegwart, Roland Y.
Related publications and datasets
Has part: https://doi.org/10.3929/ethz-b-000511402