Self-improving, Open-World Robotic Scene Understanding
OPEN ACCESS
Date
2022
Publication Type
Doctoral Thesis
ETH Bibliography
yes
Abstract
There is a wide range of applications in which autonomous robots could take over dangerous tasks, help to reduce energy and resource use, or enable novel designs. Many of these, however, require robots to operate in our everyday environments rather than on factory floors. A crucial challenge in autonomous robotics is therefore to enable robots to navigate and interact in these environments, which requires semantic scene understanding. To this end, data-driven algorithms, especially deep learning, have greatly improved the ability of machines to detect and identify objects. Yet these methods fail in the presence of unseen object types. When robots are deployed to open-world environments, they must operate autonomously even in the presence of unknown objects.
This thesis investigates the problem of robotic scene understanding in open-world, everyday environments. The proposed systems are able to identify unknown parts of a scene, and even adapt and improve their perception capabilities in these environments fully autonomously.
As a first step, this thesis introduces a benchmark that measures how well robotic perception methods can identify outliers. It focuses on anomalies in semantic segmentation for urban driving, where unknown categories should be correctly segmented in images. The benchmark reveals a significant gap between existing methods and the requirements of robotic systems. As a public benchmark, it also facilitates efforts to improve methods and drive progress on the task of anomaly detection. Pursuing this goal, the thesis proposes a method that systematically addresses different failure modes of semantic segmentation in the presence of anomalies. It detects anomalies by re-synthesizing a possible input image from the predicted semantic segmentation and comparing the re-synthesized image with the original input, while also taking predictive uncertainties into account.
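The idea of combining a re-synthesis discrepancy with predictive uncertainty can be illustrated with a toy per-pixel score. This is only a minimal sketch of the general principle, not the thesis's actual pipeline; the function name, the L1 discrepancy, the entropy term, and the weighting are all assumptions for illustration.

```python
import numpy as np

def anomaly_score(image, resynthesized, class_probs, w=0.5):
    """Toy per-pixel anomaly score (illustrative sketch, not the thesis method).

    image, resynthesized: (H, W, C) arrays with values in [0, 1].
    class_probs: (H, W, K) per-pixel class distribution from the segmenter.
    Combines the input/re-synthesis discrepancy with predictive uncertainty.
    """
    # Per-pixel L1 discrepancy between original and re-synthesized image,
    # averaged over colour channels; large where the segmentation cannot
    # explain the input (a typical signature of an unknown object).
    discrepancy = np.abs(image - resynthesized).mean(axis=-1)

    # Predictive uncertainty: entropy of the per-pixel class distribution,
    # normalised to [0, 1] by the entropy of a uniform distribution.
    eps = 1e-12
    entropy = -(class_probs * np.log(class_probs + eps)).sum(axis=-1)
    entropy /= np.log(class_probs.shape[-1])

    # Weighted combination; pixels with a high score are flagged as anomalous.
    return w * discrepancy + (1 - w) * entropy
```

Pixels where the re-synthesized image disagrees with the input, or where the segmenter is uncertain, receive high scores; thresholding such a score yields an anomaly mask.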
Building on the detection of unknowns, the second part of the thesis investigates how robots can learn on their own from unknown environments. The thesis proposes a combination of self-supervision and continual learning to build robotic systems that can integrate knowledge gained from different environments. This is evaluated on a robot that localizes in indoor spaces. The robot successfully adapts to the different environments without any supervision, and improves as it explores more environments. To then integrate self-supervision and anomaly detection, the thesis proposes a framework that allows robots to identify novel categories encountered in a deployment environment. The framework combines clustering, inference, and mapping to train a model so that it improves on known categories while additionally learning novel ones. Towards a general solution for a wide range of scenarios, the framework integrates information from different modalities and introduces a method to optimize its parameters without human input.
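The interplay of anomaly detection and clustering for discovering novel categories can be sketched in a few lines: features far from all known-class prototypes are treated as unknown and clustered into pseudo-labels that could supervise further training. Everything here (function name, distance threshold, the tiny k-means) is a hypothetical illustration under assumed inputs, not the framework proposed in the thesis.

```python
import numpy as np

def discover_novel_classes(features, prototypes, dist_thresh, n_new=2,
                           iters=10, seed=0):
    """Illustrative sketch of novelty discovery (assumed names and parameters).

    features: (N, D) embeddings; prototypes: (K, D) known-class centroids.
    Returns per-sample labels (novel classes numbered after the known ones)
    and a boolean mask marking which samples were treated as unknown.
    """
    # Distance of every feature to every known-class prototype.
    d = np.linalg.norm(features[:, None] - prototypes[None], axis=-1)
    labels = d.argmin(axis=1)
    # Samples far from all known prototypes are treated as unknown.
    unknown_mask = d.min(axis=1) > dist_thresh

    unknowns = features[unknown_mask]
    if len(unknowns):
        n_new = min(n_new, len(unknowns))
        rng = np.random.default_rng(seed)
        # Plain k-means over the unknown features to form pseudo-classes.
        centers = unknowns[rng.choice(len(unknowns), n_new, replace=False)]
        for _ in range(iters):
            assign = np.linalg.norm(
                unknowns[:, None] - centers[None], axis=-1).argmin(axis=1)
            for k in range(n_new):
                if (assign == k).any():
                    centers[k] = unknowns[assign == k].mean(axis=0)
        # Novel classes receive labels following the known ones.
        labels[unknown_mask] = len(prototypes) + assign
    return labels, unknown_mask
```

The resulting pseudo-labels stand in for the missing supervision: the model can be retrained on them so that it keeps improving on known categories while additionally learning the newly discovered ones.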
In conclusion, this thesis presents the necessary building blocks of a system that perpetually provides scene understanding, detects outliers, and improves itself. This is a fundamental shift from the dominant methodology of collecting large datasets to train fixed models, towards more dynamic and adaptive robotic perception systems.
Publication status
published
Contributors
Examiner: Siegwart, Roland
Examiner: Sünderhauf, Niko
Examiner: Posner, Ingmar
Publisher
ETH Zurich
Subject
semantic segmentation; robotics; robotic perception; anomaly detection; continual learning
Organisational unit
03737 - Siegwart, Roland Y.