Neural Scene Representations for 3D Reconstruction and Scene Understanding


Author / Producer

Date

2023

Publication Type

Doctoral Thesis

ETH Bibliography

yes

Data

Abstract

In an era where machines are increasingly integrated into our daily lives, their ability to perceive and understand the three-dimensional world is of great importance. Central to this capability is the scene representation, which translates sensory data into compact, detailed, and holistic descriptions of the environment. While deep learning, particularly Convolutional Neural Networks (CNNs), has revolutionized many facets of computer vision, its primary focus remains on 2D information. This thesis examines the challenges and potential of transferring these technologies to 3D environments, aiming to bridge the gap between machine perception and human-like spatial understanding. Our primary objective is to develop neural scene representations tailored for accurate 3D reconstruction and comprehensive 3D scene understanding. We start by introducing a scalable scene representation for deep-learning-based 3D reconstruction. This representation captures 3D shapes in a continuous, resolution-agnostic fashion, effectively addressing the constraints of traditional methods based on explicit representations. Next, by incorporating a differentiable point-to-mesh layer, we present a lightweight representation that delivers high-quality reconstruction with rapid inference, addressing the need for speed in real-world applications. Furthermore, we explore dense visual Simultaneous Localization and Mapping (SLAM) with a system that employs hierarchical neural implicit representations. This approach enables detailed reconstruction in large-scale indoor scenarios, pushing the boundaries of what is achievable with current SLAM systems. Lastly, our research culminates in a unified scene representation for a broad spectrum of 3D scene understanding tasks, bypassing the need for costly 3D labeled data.
In conclusion, this thesis presents a series of advancements in neural scene representations, offering solutions that not only enhance 3D reconstruction capabilities but also elevate the level of 3D scene understanding, bringing us a step closer to machine perception that mirrors human cognition.

Publication status

published

Contributors

Examiner : Pollefeys, Marc
Examiner : Geiger, Andreas
Examiner : Sitzmann, Vincent
Examiner : Guibas, Leonidas J.

Publisher

ETH Zurich

Organisational unit

03766 - Pollefeys, Marc / Pollefeys, Marc
