Doctoral Thesis
Rights / license: In Copyright - Non-Commercial Use Permitted
Research on reconstruction of objects and environments in three dimensions has made great progress over the past decade. Applications such as building maps of environments, creating 3D models for robot manipulation, and generating digital content for use in movies, games and virtual environments all benefit from techniques that can reconstruct objects with high fidelity and accuracy. Robustness of the reconstruction pipeline can be improved by incorporating semantic information from the environment. Semantic information also supplements the reconstructed model, enabling applications such as robot manipulation, measurement of class-specific metrics, and artistic control of objects. However, the vast majority of this research, both on reconstruction and on semantic segmentation, targets human-made objects and environments. These are characterized by geometry that is easier to parametrize and by features, such as corners and edges, that can be tracked reliably even from viewpoints that are far apart. Natural structures such as trees, foliage and corals, on the other hand, consist of elements that are self-similar, repetitive, non-parametric and semi-rigid, have self-occluding geometry, and display limited variation in colour. This makes it challenging to apply techniques developed for human-made objects in natural environments. The focus of this thesis is to develop algorithms that tackle some of these challenges and enable high-quality reconstruction of natural structures.

Understanding semantics helps mitigate some of these challenges. To this end, we propose three algorithms for semantic segmentation of vegetation. The first uses features based on surface curvatures as the representation of local geometry. The second aims to learn these features using a Convolutional Neural Network (CNN). The third also uses CNNs but performs semantic segmentation on single-frame RGB-D images, as opposed to the full point clouds used in the first two approaches. Because this approach learns features from partial observations of geometry, it can be used to improve the robustness of the reconstruction framework. Because accurate parametric models of the unstructured geometry are difficult to derive, we take a data-driven approach in all three algorithms and learn features directly from the data. The data required for this purpose is generated using state-of-the-art simulation software. Evaluation on real data shows the extent to which knowledge transfers from simulation to reality.

Improving camera tracking paves the way for better reconstruction accuracy. Given that traditional feature-based approaches perform poorly in natural environments, we employ deep neural networks to learn robust features directly from the environment. Here, we push the data-driven approach to its limit and investigate whether a deep neural network can learn to predict poses from input images through end-to-end learning.

Finally, we extend the scope of the aforementioned techniques to underwater environments to facilitate scanning and reconstruction of coral reefs. We demonstrate underwater 3D capture using commodity depth cameras and present an algorithm to calibrate a camera and its housing in order to undo the distortions caused by refraction.
External links: Search print copy at ETH Library
Subject: Semantic segmentation; Mapping; Deep Learning; Vegetation modeling; 3D Reconstruction
Organisational unit: 03737 - Siegwart, Roland Y. / Siegwart, Roland Y.
Notes: This thesis was a joint collaboration with ETH Zurich and Disney Research.