Search

Now showing items 1-10 of 342

The Drunkard’s Odometry: Estimating Camera Motion in Deforming Scenes

Recasens, David; Oswald, Martin R.; Pollefeys, Marc; et al. (2024)

Advances in Neural Information Processing Systems 36

Estimating camera motion in deformable scenes poses a complex and open research challenge. Most existing non-rigid structure from motion techniques assume that static scene parts are observed alongside the deforming ones in order to establish an anchoring reference. However, this assumption does not hold true in certain relevant application cases such as endoscopies. Deformable odometry and SLAM pipelines, which tackle the most challenging ...

Conference Paper

SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding

Sarlin, Paul-Edouard; Trulls, Eduard; Pollefeys, Marc; et al. (2024)

Advances in Neural Information Processing Systems 36

Semantic 2D maps are commonly used by humans and machines for navigation purposes, whether it's walking or driving. However, these maps have limitations: they lack detail, often contain inaccuracies, and are difficult to create and maintain, especially in an automated fashion. Can we use raw imagery to automatically create better maps that can be easily interpreted by both humans and machines? We introduce SNAP, a deep network that learns ...

Conference Paper

OpenMask3D: Open-Vocabulary 3D Instance Segmentation

Takmaz, Ayça; Fedele, Elisabetta; Sumner, Robert W.; et al. (2024)

Advances in Neural Information Processing Systems 36

We introduce the task of open-vocabulary 3D instance segmentation. Current approaches for 3D instance segmentation can typically only recognize object categories from a pre-defined closed set of classes that are annotated in the training datasets. This results in important limitations for real-world applications where one might need to perform tasks guided by novel, open-vocabulary queries related to a wide variety of objects. Recently, ...

Conference Paper

LABELMAKER: Automatic Semantic Label Generation from RGB-D Trajectories

Weder, Silvan; Blum, Hermann; Engelmann, Francis; et al. (2024)

2024 International Conference on 3D Vision (3DV)

Semantic annotations are indispensable to train or evaluate perception models, yet very costly to acquire. This work introduces a fully automated 2D/3D labeling framework that, without any human intervention, can generate labels for RGB-D scans at a level of accuracy equal to (or better than) that of comparable manually annotated datasets such as ScanNet. Our approach is based on an ensemble of state-of-the-art segmentation models and 3D lifting through ...

Conference Paper

Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature

Jin, Shengze; Barath, Daniel; Pollefeys, Marc; et al. (2024)

2024 International Conference on 3D Vision (3DV)

Point cloud registration has seen recent success with several learning-based methods that focus on correspondence matching, and as such, optimize only for this objective. Following the learning step of correspondence matching, they evaluate the estimated rigid transformation with a RANSAC-like framework. While this framework is an indispensable component of these methods, it prevents fully end-to-end training, leaving the objective to minimize the ...

Conference Paper

NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM

Zhu, Zihan; Peng, Songyou; Larsson, Viktor; et al. (2024)

2024 International Conference on 3D Vision (3DV)

Neural implicit representations have recently become popular in simultaneous localization and mapping (SLAM), especially in dense visual SLAM. However, existing works either rely on RGB-D sensors or require a separate monocular SLAM approach for camera tracking, and fail to produce high-fidelity 3D dense reconstructions. To address these shortcomings, we present NICER-SLAM, a dense RGB SLAM system that simultaneously optimizes for camera ...

Conference Paper

Handbook on Leveraging Lines for Two-View Relative Pose Estimation

Hruby, Petr; Liu, Shaohui; Pautrat, Rémi; et al. (2024)

2024 International Conference on 3D Vision (3DV)

We propose an approach for estimating the relative pose between calibrated image pairs by jointly exploiting points, lines, and their coincidences in a hybrid manner. We investigate all possible configurations where these data modalities can be used together and review the minimal solvers available in the literature. Our hybrid framework combines the advantages of all configurations, enabling robust and accurate estimation in challenging ...

Conference Paper

OpenScene: 3D Scene Understanding with Open Vocabularies

Peng, Songyou; Genova, Kyle; Jiang, Chiyu; et al. (2023)

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Traditional 3D scene understanding approaches rely on labeled 3D datasets to train a model for a single task with supervision. We propose OpenScene, an alternative approach where a model predicts dense features for 3D scene points that are co-embedded with text and image pixels in CLIP feature space. This zero-shot approach enables task-agnostic training and open-vocabulary queries. For example, to perform SOTA zero-shot 3D semantic ...

Conference Paper

Four-view Geometry with Unknown Radial Distortion

Hruby, Petr; Korotynskiy, Viktor; Duff, Timothy; et al. (2023)

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

We present novel solutions to previously unsolved problems of relative pose estimation from images whose calibration parameters, namely focal lengths and radial distortion, are unknown. Our approach enables metric reconstruction without modeling these parameters. The minimal case for reconstruction requires 13 points in 4 views for both the calibrated and uncalibrated cameras. We describe and implement the first solution to these minimal ...

Conference Paper

Removing Objects From Neural Radiance Fields

Weder, Silvan; Garcia-Hernando, Guillermo; Monszpart, Áron; et al. (2023)

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Neural Radiance Fields (NeRFs) are emerging as a ubiquitous scene representation that allows for novel view synthesis. Increasingly, NeRFs will be shareable with other people. Before sharing a NeRF, though, it might be desirable to remove personal information or unsightly objects. Such removal is not easily achieved with the current NeRF editing frameworks. We propose a framework to remove objects from a NeRF representation created from ...

Conference Paper
