Perception and Learning for Autonomous UAV Missions


Date

2020

Publication Type

Doctoral Thesis

ETH Bibliography

yes

Abstract

The progress in the research and development of Unmanned Aerial Vehicles (UAVs) has been tremendous in the last decade, making drones a valuable tool for automating applications that are risky, monotonous, or even unachievable for human-crewed operations: UAVs promise to deliver medical supplies to remote places, count wildlife, or create overview images for damage assessment after natural catastrophes. Nevertheless, a large proportion of the research has concentrated on confined indoor spaces, where motion capture systems can provide nearly perfect state estimation and a large variety of depth sensors is applicable. Instead, this dissertation focuses on the challenges that arise when the UAV leaves the controlled lab environment and has to cope with limited payload capacity, noisy measurements, and vast unstructured scenes. Within the scope of this thesis, we combine machine and deep learning with computer vision to develop the core perception abilities that a robot needs for informed, autonomous decision-making.

A fundamental competence that an autonomous UAV requires is the ability to localize itself in large-scale outdoor environments under severe lighting conditions. Global Navigation Satellite Systems (GNSS) may be used for localization in specific applications, but the provided accuracy is limited, and multipath effects or dropouts may occur next to mountainsides or during operations close to the ground. In our first publication, we propose a navigation system, consisting of a single camera and an Inertial Measurement Unit (IMU), that makes accurate optimization-based state estimation computationally feasible by employing a sliding-window estimator. The visual-inertial solution provides smooth pose estimates close to the ground, enabling otherwise risky maneuvers such as landings, take-offs, or fly-bys. For larger-scale, geo-referenced localization, pre-existing maps generated from satellite or UAV imagery have immense potential, yet appearance and environmental changes can be significant. The second publication introduces a framework to generate geo-referenced elevation maps and orthoimages for real-time robotic applications. The closely related third publication combines a rendering engine with a deep learning-based image alignment algorithm that estimates the geo-referenced six degrees of freedom (DoF) camera pose even under substantial environmental and illumination variations.

Furthermore, an autonomous UAV requires reliable depth estimation to ensure environmental awareness, detect and avoid unmapped or dynamic obstacles, and navigate safely through unexplored, potentially cluttered environments. This capability becomes particularly crucial with the increasing amount of air traffic induced by the surge of UAVs. However, for the vast majority of small-scale UAVs, available depth sensors are not deployable due to tight constraints on payload, dimensions, price, and power consumption, as well as stringent requirements on range and resolution. Motivated by the limitations of depth-from-motion principles with only a single camera, we design a novel multi-IMU multi-camera system for long-range depth estimation, particularly suited for fixed-wing UAVs. The non-rigid multi-view stereo baseline is estimated using inertial measurements and visual cues, and is further enhanced with deep learning.
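
As an illustration of the long-range depth estimation principle summarized above, the short Python sketch below recovers depth from the disparity between two rectified cameras with a per-frame baseline; the rectified pinhole geometry, the function name, and the numbers are illustrative assumptions, not the implementation developed in the thesis.

def depth_from_disparity(disparity_px, baseline_m, focal_px):
    # Rectified-stereo relation z = f * b / d. On a fixed-wing UAV the
    # baseline b between wing-mounted cameras is not rigid; here it is
    # simply passed in per frame, standing in for a value estimated
    # online from inertial measurements and visual cues.
    return focal_px * baseline_m / disparity_px

# Hypothetical numbers: a 3 m baseline and a 1000 px focal length turn
# a 2 px disparity into roughly 1500 m of range, illustrating why a
# wide baseline is attractive for long-range depth estimation.
print(depth_from_disparity(disparity_px=2.0, baseline_m=3.0, focal_px=1000.0))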
The final part of this dissertation investigates autonomous landing site detection, deep learning-based human detection, and collaborative reconstruction, covering further essential perception capabilities, including scene understanding, object detection, and point cloud registration. Overall, this dissertation provides an extensive perception framework for UAVs, investigating crucial aspects of autonomous mission completion. The proposed algorithms are validated with realistic synthetic datasets, hardware-in-the-loop tests, or real-world experiments. Within the scope of the thesis, more than six different rotary-wing and fixed-wing platforms have been equipped with sensor systems and used in real-world missions. To accelerate research in this field, the source code of several algorithms and valuable datasets with different sensor modalities have been made publicly available to the community.
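
To make the point cloud registration capability mentioned above concrete, the following minimal Python sketch computes a least-squares rigid alignment (Kabsch/SVD) between two point sets with known one-to-one correspondences; it is a generic textbook building block under that correspondence assumption, not the optical-lidar alignment pipeline of the thesis.

import numpy as np

def rigid_align(source, target):
    # Kabsch algorithm: find rotation R and translation t minimizing
    # the squared distances || R @ p_src + t - p_tgt ||^2 over all
    # corresponding points (both inputs are N x 3 arrays).
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # avoid a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = tgt_mean - R @ src_mean
    return R, t

# Toy check with made-up data: recover a known 90-degree yaw and offset.
pts = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
R_est, t_est = rigid_align(pts, pts @ R_true.T + np.array([0.5, -0.2, 1.0]))
print(np.allclose(R_est, R_true), t_est)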

Publication status

published

Contributors

Examiner: Siegwart, Roland
Examiner: Chli, Margarita
Examiner: Shen, Shaojie

Publisher

ETH Zurich

Subject

Unmanned Aerial Vehicles; Computer Vision; Machine Learning; Deep Learning; Localization and Mapping; Long-range Depth Estimation; Landing Site Detection; Optical-Infrared Human Detection; Optical-Lidar Point Cloud Alignment; Reference View Rendering for Localization; Vision-based Wing Modelling

Organisational unit

03737 - Siegwart, Roland Y. (emeritus) / Siegwart, Roland Y. (emeritus)
