Efficient algorithms and systems for multi-agent mapping and world-scale real-time localization
- Doctoral Thesis
Rights / license: In Copyright - Non-Commercial Use Permitted
Every second the world around us changes in appearance and geometry. Buildings get torn down and businesses open and close, while light and shadow play throughout the day. Even low-frequency variation such as seasonal change influences how the environment looks and challenges us anew to recognize the path to work and back home. While building a mental model of our environment and recognizing it on our next visit seems effortless to us, teaching machines to do so is a huge undertaking. The main theme of this thesis is to research algorithms and systems that allow robots and mobile devices to understand their position in an environment over long periods of time. With such understanding, we hope, robots can become more autonomous in building and maintaining their map of the environment such that they can collaborate and become more useful for humans. The research addresses the question of how we can efficiently capture the ever-changing world around us such that we can localize agents in that model while concurrently updating it from newly captured data. A typical method for capturing the environment is to use photo collections and run "Structure from Motion", a class of algorithms which jointly estimate the position and orientation of the cameras at the time of exposure as well as the geometry of the environment in the form of sparse 3D points. Each such 3D point corresponds to a salient part of the environment and is associated with a fingerprint that describes its appearance. The resulting model allows computers to recognize a scene by comparing visual fingerprints between the model and the live camera stream. If we were able to build the model, identify changes in the environment and communicate updates to all other agents in real time, the question posed above would be considered solved. Building the model and identifying its dynamic parts, however, is computationally costly, and sharing the resulting model requires substantial network bandwidth.
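The fingerprint comparison described above can be illustrated with a minimal sketch. It assumes fingerprints are plain numeric vectors compared by Euclidean distance and filtered with a nearest-neighbor ratio test; the abstract does not state which descriptor or matching scheme the thesis uses, so the function name `match_fingerprints`, the `ratio` threshold and the toy data are all hypothetical.

```python
import math

def match_fingerprints(query, model, ratio=0.8):
    """Return (query_index, model_index) pairs that pass the ratio test.

    Assumption: fingerprints are equal-length numeric tuples; this is a
    generic illustration, not the matching scheme of the thesis.
    """
    matches = []
    for qi, q in enumerate(query):
        # Distance from this query fingerprint to every model fingerprint.
        dists = sorted((math.dist(q, m), mi) for mi, m in enumerate(model))
        if len(dists) < 2:
            continue
        (best, best_idx), (second, _) = dists[0], dists[1]
        # Accept only if the best candidate is clearly closer than the runner-up.
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches

# Toy data: three model fingerprints and two query fingerprints.
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
query = [(0.9, 0.1, 0.0), (0.5, 0.5, 0.0)]
matches = match_fingerprints(query, model)
```

In this toy data the first query fingerprint is close to exactly one model point and is matched, while the second lies halfway between two model points and is rejected as ambiguous. A brute-force search like this scales linearly with model size, which hints at why indexing and compression matter for large maps.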
Most use cases additionally require localization and state estimation to run in real time for user feedback or robotic control, which adds computational load on the agent. We address these limitations and challenges by proposing a set of algorithms that span the cycle from map collection and change identification through compression of the environment to real-time localization. The proposed approaches target systems that are resource-constrained relative to the size of their operating environment, as found in robotics and mobile applications. Depending on the use case, this ranges from single-processor platforms to data-center-sized infrastructure, localizing a group of robots or millions of mobile phones. In the first part of the thesis we present algorithms that provide real-time pose estimates, which are the basis for navigating and capturing the environment. We start from an extended Kalman filter formulation and extend it with the capability to fuse multiple sensors with time delays, and add robustness to sensor signal loss. To ensure constant, real-time performance, the approach keeps only a limited memory of the environment, which is sufficient to locally prevent positional drift. Given noisy sensor signals and modeling errors, however, the positioning accuracy of such an approach degrades as the platform moves through the environment. To correct for the resulting drift in pose, we combine the pose estimator with a loop-closure module that recognizes previously visited places and can thus correct for accumulated error. We propose algorithms that change the definition of a place such that it becomes continuous instead of the arbitrary discretization previously applied in the literature, which improves recognition rates substantially. To accelerate the model-capturing process, we combine the algorithms in a multi-agent mapping system where agents share data to collaboratively build a joint model of the environment.
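The interplay of drift and loop closure described above can be sketched with a toy scalar Kalman filter. This is a deliberately simplified linear, one-dimensional stand-in, not the multi-sensor extended Kalman filter of the thesis: the functions `predict` and `update`, the noise values and the single absolute position fix standing in for a loop closure are all assumptions made for illustration.

```python
def predict(x, p, u, q):
    """Propagate position x by odometry step u; variance p grows by q."""
    return x + u, p + q

def update(x, p, z, r):
    """Fuse an absolute position measurement z with variance r."""
    k = p / (p + r)  # Kalman gain: how much to trust the measurement
    return x + k * (z - x), (1.0 - k) * p

# Hypothetical values: start well localized, then move ten steps on
# odometry alone, so the position uncertainty accumulates.
x, p = 0.0, 0.01
for _ in range(10):
    x, p = predict(x, p, u=1.0, q=0.04)
p_before = p

# A recognized place provides an absolute fix (stand-in for loop closure),
# which pulls the estimate toward the fix and shrinks the variance.
x, p = update(x, p, z=9.5, r=0.05)
```

Without the final update the variance keeps growing with distance travelled, which is the filter's view of positional drift; the absolute fix is what bounds it.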
In our second project we take the information from multiple agents and fuse it into a common lifelong map to improve robustness and coverage. We investigate which signals can be used to identify overlapping, redundant or outdated information about the environment and propose optimization strategies to reduce the model size while maintaining localization performance. Combined with efficient indexing schemes, we show that localization queries can run in real time even against models spanning large environments such as entire cities. Our experiments demonstrate that artifacts from high compression rates can be mitigated by fusing the localization signal with the on-board state estimation in real time. The overall research is motivated by the vision of localizing a large number of agents in a dynamic, real-world, non-laboratory environment while concurrently updating its model from new experience captured by the agents over their lifetime. As a proof of concept, the algorithms form the foundation of Google's "Visual Positioning Service", where they allow the real-time localization of mobile phones for both small- and large-scale Augmented Reality applications.
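One way to picture reducing model size while maintaining localization performance is a greedy selection that keeps only as many landmarks as needed for every keyframe to remain observable. The sketch below is a generic set-cover-style heuristic under assumed inputs; the abstract does not describe the thesis's actual optimization strategy, and `select_landmarks`, `min_per_frame` and the example data are hypothetical.

```python
def select_landmarks(observations, min_per_frame):
    """Greedily pick landmarks until every keyframe sees at least
    min_per_frame selected landmarks, or no remaining landmark helps.

    observations: dict mapping landmark id -> set of keyframe ids observing it.
    Assumption: a generic coverage heuristic, not the method of the thesis.
    """
    frames = set().union(*observations.values())
    need = {f: min_per_frame for f in frames}
    remaining = dict(observations)
    selected = []
    while remaining and any(n > 0 for n in need.values()):
        # Gain = number of still under-covered keyframes a landmark would help.
        def gain(lm):
            return sum(1 for f in remaining[lm] if need[f] > 0)
        # Sorting the ids makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break
        for f in remaining.pop(best):
            need[f] = max(0, need[f] - 1)
        selected.append(best)
    return selected

# Toy map: five landmarks observed from three keyframes.
observations = {"a": {1, 2, 3}, "b": {1, 2}, "c": {3}, "d": {2, 3}, "e": {1}}
selected = select_landmarks(observations, min_per_frame=2)
```

Here three of the five landmarks are enough for each keyframe to keep two observations; the remaining two are redundant under this coverage criterion and could be dropped from the shared model.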
Contributors
Examiner: Siegwart, Roland Y.
Examiner: Scaramuzza, Davide
Subject: ROBOT VISION; ROBOT NAVIGATION; localization
Organisational unit: 03737 - Siegwart, Roland Y. / Siegwart, Roland Y.