Margarita Chli



Last Name: Chli

First Name: Margarita

Search Results

Publications 1 - 10 of 79
  • Chen, Zetao; Maffra, Fabiola; Sa, Inkyu; et al. (2017)
    2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
    Recently, image representations derived from Convolutional Neural Networks (CNNs) have been demonstrated to achieve impressive performance on a wide variety of tasks, including place recognition. In this paper, we take a step deeper into the internal structure of CNNs and propose novel CNN-based image features for place recognition by identifying salient regions and creating their regional representations directly from the convolutional layer activations. A range of experiments is conducted on challenging datasets with varied conditions and viewpoints. These reveal superior precision-recall characteristics and robustness against both viewpoint and appearance variations for the proposed approach over the state of the art. By analyzing the feature encoding process of our approach, we provide insights into what makes an image representation robust against external variations.
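    As an aside on the regional-encoding idea described in this abstract, the following is a minimal illustrative sketch in NumPy (not the paper's actual encoding): it selects salient spatial locations of a convolutional activation tensor by their total activation energy and returns an L2-normalised local descriptor per salient location. The saliency quantile and pooling choices are assumptions made purely for illustration.

    import numpy as np

    def regional_descriptors(activations, saliency_quantile=0.9):
        """Pool local descriptors over salient regions of a conv activation map.

        activations: array of shape (C, H, W) from a convolutional layer.
        Returns a list of L2-normalised per-location descriptors for salient cells.
        """
        _, H, W = activations.shape
        # Per-location saliency: total activation energy across channels.
        saliency = activations.sum(axis=0)                    # (H, W)
        threshold = np.quantile(saliency, saliency_quantile)
        descriptors = []
        for y in range(H):
            for x in range(W):
                if saliency[y, x] >= threshold:
                    d = activations[:, y, x]                  # (C,) local descriptor
                    descriptors.append(d / (np.linalg.norm(d) + 1e-12))
        return descriptors

    # Example with a random tensor standing in for real conv-layer activations.
    feats = regional_descriptors(np.random.rand(256, 14, 14))
    print(len(feats), "salient regional descriptors")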
  • Teixeira, Lucas; Chli, Margarita (2017)
    Proceedings - IEEE International Conference on Robotics and Automation
  • Schmuck, Patrik; Ziegler, Thomas; Karrer, Marco; et al. (2021)
    2021 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct)
    Collaborative SLAM enables a group of agents to simultaneously co-localize and jointly map an environment, thus paving the way to wide-ranging applications of multi-robot perception and multi-user AR experiences by eliminating the need for external infrastructure or pre-built maps. This article presents COVINS, a novel collaborative SLAM system that enables multi-agent, scalable SLAM in large environments and for large teams of more than 10 agents. The paradigm here is that each agent runs visual-inertial odometry independently onboard in order to ensure its autonomy, while sharing map information with the COVINS server back-end running on a powerful local PC or a remote cloud server. The server back-end establishes an accurate collaborative global estimate from the contributed data, refining the joint estimate by means of place recognition, global optimization and removal of redundant data, in order to ensure an accurate, but also efficient SLAM process. A thorough evaluation of COVINS reveals increased accuracy of the collaborative SLAM estimates, as well as efficiency in both removing redundant information and reducing the coordination overhead, and demonstrates successful operation in a large-scale mission with 12 agents jointly performing SLAM.
  • Rumley, Simon; Thoma, Anastasia; Beardsley, Paul A.; et al. (2023)
    With the emergence of robots driving or flying under the canopy of agricultural environments, localization becomes a problem, given that both GPS and traditional, sparse feature-based place recognition perform poorly in such environments. This paper proposes an approach which converts imagery from an agricultural orchard taken at the below-canopy level into bird's-eye-view imagery, essentially generating a top view of the field indicating the tree positions in the horizontal plane. This is a step towards registering low- and high-altitude imagery. State-of-the-art learning-based methods for such tasks, known as Perspective View to Bird's Eye View (PV2BEV), exist for urban scenes, particularly in the self-driving vehicle domain. Here, existing methods are evaluated in an agricultural setting, which poses notable challenges due to the lack of structure and variability. We create high-quality synthetic datasets for the training of networks and show preliminary evaluations on both synthetic and real imagery.
  • Mascaro, Ruben; Teixeira, Lucas; Chli, Margarita (2022)
    IEEE Robotics and Automation Letters
    Robots operating in real-world settings often need to plan interactions with surrounding scene elements, and it is therefore crucial for them to understand their workspace at the level of individual objects. In this spirit, this work presents a novel approach to progressively build instance-level, dense 3D maps from color and depth cues acquired by either a moving RGB-D sensor or a camera-LiDAR setup, whose pose is being tracked. The proposed framework processes each input RGB image with a semantic instance segmentation neural network and uses depth information to extract a set of per-frame, semantically labeled 3D instance segments, which then get matched to object instances already identified in previous views. Following integration of these newly detected instance segments in a global volumetric map, an efficient label diffusion scheme that considers multi-view instance predictions together with the reconstructed scene geometry is used to refine 3D segmentation boundaries. Experiments on indoor RGB-D benchmark sequences show that the proposed system achieves state-of-the-art performance in terms of 3D segmentation accuracy, while reducing the computational processing cost required at each frame. Furthermore, the applicability of the system to challenging domains outside the traditional office scenes is demonstrated by testing it on a robotic excavator equipped with a calibrated camera-LiDAR setup, with the goal of segmenting individual boulders in a highly cluttered construction scenario.
  • Alzugaray, Ignacio; Chli, Margarita (2019)
    2019 International Conference on 3D Vision (3DV)
    With the emergence of event cameras, increasing research effort has been focusing on processing the asynchronous stream of events. With each event encoding a discrete intensity change at a particular pixel, uniquely time-stamped with high accuracy, this sensing information is so fundamentally different from the data provided by traditional frame-based cameras that most of the well-established vision algorithms are not applicable. Inspired by the need for effective event-based tracking, this paper addresses the tracking of generic patch features relying solely on events, while exploiting their asynchronicity and high temporal resolution. The proposed approach outperforms the state of the art in event-based feature tracking on well-established event camera datasets, retrieving longer and more accurate feature tracks at a higher frequency. Considering tracking as an optimization problem of matching the current view to a feature template, the proposed method implements a simple and efficient technique that only requires the evaluation of a discrete set of tracking hypotheses.
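    To illustrate the hypothesis-evaluation idea mentioned at the end of this abstract, here is a deliberately simplified NumPy sketch (not the paper's tracker): it scores a discrete set of purely translational hypotheses for a patch of accumulated events against a template and returns the best one. The correlation score and search radius are assumptions made here for illustration only.

    import numpy as np

    def best_translation(template, current, search_radius=3):
        """Evaluate a discrete set of translational tracking hypotheses.

        template, current: 2-D arrays of accumulated event counts for a patch.
        Returns the (dy, dx) shift of the template that best explains the
        current events, together with its correlation score.
        """
        best_score, best_shift = -np.inf, (0, 0)
        for dy in range(-search_radius, search_radius + 1):
            for dx in range(-search_radius, search_radius + 1):
                shifted = np.roll(np.roll(template, dy, axis=0), dx, axis=1)
                score = float((shifted * current).sum())      # simple correlation
                if score > best_score:
                    best_score, best_shift = score, (dy, dx)
        return best_shift, best_score

    # Toy usage: the current patch is the template shifted by (1, 2) pixels.
    tmpl = np.zeros((15, 15)); tmpl[5:10, 5:10] = 1.0
    curr = np.roll(np.roll(tmpl, 1, axis=0), 2, axis=1)
    print(best_translation(tmpl, curr))                       # ((1, 2), 25.0)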
  • Bänninger, Philipp; Alzugaray, Ignacio; Karrer, Marco; et al. (2023)
    2023 IEEE International Conference on Robotics and Automation (ICRA)
    State-of-the-art decentralized collaborative Simultaneous Localization And Mapping (SLAM) systems crucially lack the ability to effectively use well-mapped areas generated by other agents in the team for relocalization. This often leads to map redundancy between agents, inefficient communication, and the need for costly re-mapping of areas previously mapped by other agents. In this work, we propose a strategy to efficiently share the areas mapped by different agents in a collaborative, decentralized SLAM system. This approach directly addresses map redundancy while maintaining the consistency of the estimates across the agents and keeping the overall system scalable in terms of cross-agent communication and individual computational effort. Our method leverages covisibility information between keyframes instantiated by different agents to transfer local sub-maps on-the-fly in a completely decentralized, peer-to-peer fashion. A globally consistent estimate is achieved by solving a distributed bundle adjustment problem using the Alternating Direction Method of Multipliers (ADMM), where we enforce constraints on shared map points and keyframes across agents.
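    For a feel of the consensus mechanism named in this abstract, below is a minimal consensus-ADMM sketch in NumPy on a single shared variable (e.g. a map point observed by several agents). The per-agent cost here is just a squared distance to a private estimate, a stand-in assumption that is far simpler than a real distributed bundle adjustment.

    import numpy as np

    def admm_consensus(local_estimates, rho=1.0, iters=50):
        """Consensus ADMM: agents with private costs ||x_i - a_i||^2 agree on z.

        local_estimates: (n_agents, dim) array of private estimates a_i.
        Returns the consensus value z (here it converges to the mean of the a_i).
        """
        a = np.asarray(local_estimates, dtype=float)
        x = a.copy()                                     # local primal variables
        u = np.zeros_like(a)                             # scaled dual variables
        z = x.mean(axis=0)                               # consensus variable
        for _ in range(iters):
            x = (2.0 * a + rho * (z - u)) / (2.0 + rho)  # per-agent local updates
            z = (x + u).mean(axis=0)                     # consensus update
            u = u + x - z                                # dual update
        return z

    # Three agents holding slightly different estimates of the same 3-D point.
    print(admm_consensus([[1.0, 2.0, 0.9], [1.1, 1.9, 1.0], [0.9, 2.1, 1.1]]))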
  • Keller, Michel; Chen, Zetao; Maffra, Fabiola; et al. (2018)
    2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
    Research on learning suitable feature descriptors for Computer Vision has recently shifted to deep learning where the biggest challenge lies with the formulation of appropriate loss functions, especially since the descriptors to be learned are not known at training time. While approaches such as Siamese and triplet losses have been applied with success, it is still not well understood what makes a good loss function. In this spirit, this work demonstrates that many commonly used losses suffer from a range of problems. Based on this analysis, we introduce mixed-context losses and scale-aware sampling, two methods that when combined enable networks to learn consistently scaled descriptors for the first time.
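    For context on the conventional losses this abstract refers to, here is a plain-NumPy sketch of a standard triplet margin loss on single descriptors; the paper's mixed-context losses and scale-aware sampling are not reproduced here, and the margin value is an arbitrary illustrative choice.

    import numpy as np

    def triplet_loss(anchor, positive, negative, margin=0.2):
        """Standard triplet margin loss on L2-normalised descriptors.

        Encourages d(anchor, positive) + margin <= d(anchor, negative).
        """
        def normalize(v):
            return v / (np.linalg.norm(v) + 1e-12)
        a, p, n = (normalize(v) for v in (anchor, positive, negative))
        d_pos = np.linalg.norm(a - p)
        d_neg = np.linalg.norm(a - n)
        return max(0.0, d_pos - d_neg + margin)

    # Toy descriptors: the positive is a perturbed anchor, the negative is unrelated.
    rng = np.random.default_rng(0)
    a = rng.normal(size=128)
    # Prints 0.0 here: this triplet already satisfies the margin constraint.
    print(triplet_loss(a, a + 0.05 * rng.normal(size=128), rng.normal(size=128)))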
  • Maffra, Fabiola; Teixeira, Lucas; Chen, Zetao; et al. (2017)
    2017 International Conference on 3D Vision (3DV)
  • Gao, Chen; Daxinger, Franz; Roth, Lukas; et al. (2024)
    2024 IEEE International Conference on Robotics and Automation (ICRA)
    Satellite imagery has traditionally been used to collect crop statistics, but its low resolution and registration accuracy limit agricultural analytics to plant stand levels and large areas. Precision agriculture seeks analytic tools at near single-plant level, and this work explores how to improve aerial photogrammetry to enable inter-day precision agriculture analytics for intervals of up to a month. Our work starts by presenting an accurately registered image time series, captured up to twice a week by an unmanned aerial vehicle over a wheat crop field. The dataset is registered using photogrammetry aided by fiducial ground control points (GCPs). Unfortunately, GCPs severely disrupt crop management activities. To address this, we propose a novel inter-day registration approach that relies on GCPs only once, at the beginning of the season. The method utilises LoFTR, a state-of-the-art image matching transformer. The original LoFTR network was trained using imagery of outdoor man-made scenes. One of our contributions is to extend the LoFTR training method from matching images of a static scene to matching images of a dynamic scene of plants undergoing growth. Another contribution is the overall evaluation of our registration method, which combines intra-day reconstruction and results from previous days in a seven degree-of-freedom alignment. The results show the benefits of our approach against other matching algorithms and the importance of retraining on crop scenes, particularly with our custom training method for growing crops, which achieves an average error of 27 cm across the season.
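    The seven degree-of-freedom alignment mentioned above (rotation, translation and scale) can be computed in closed form with the Umeyama method; the NumPy sketch below, assuming two sets of corresponding 3-D points, illustrates that general technique rather than the paper's actual pipeline.

    import numpy as np

    def umeyama_alignment(src, dst):
        """Similarity transform (s, R, t) such that dst_i ≈ s * R @ src_i + t.

        src, dst: (N, 3) arrays of corresponding 3-D points.
        """
        mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
        src_c, dst_c = src - mu_s, dst - mu_d
        cov = dst_c.T @ src_c / len(src)               # cross-covariance matrix
        U, D, Vt = np.linalg.svd(cov)
        S = np.eye(3)
        if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # enforce a proper rotation
            S[2, 2] = -1.0
        R = U @ S @ Vt
        var_src = (src_c ** 2).sum() / len(src)
        s = np.trace(np.diag(D) @ S) / var_src
        t = mu_d - s * R @ mu_s
        return s, R, t

    # Toy check: recover a known scale, rotation and translation from 100 points.
    rng = np.random.default_rng(1)
    P = rng.normal(size=(100, 3))
    R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(R_true) < 0:
        R_true[:, 0] *= -1                             # make it a rotation, not a reflection
    Q = 2.5 * P @ R_true.T + np.array([1.0, -2.0, 0.5])
    s, R, t = umeyama_alignment(P, Q)
    print(round(s, 3))                                 # ≈ 2.5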