Search Results
Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
We tackle the problem of estimating a Manhattan frame, i.e. three orthogonal vanishing points, and the unknown focal length of the camera, leveraging a prior vertical direction. The direction can come from an Inertial Measurement Unit that is a standard component of recent consumer devices, e.g., smartphones. We provide an exhaustive analysis of minimal line configurations and derive two new 2-line solvers, one of which does not suffer ...
Conference Paper
Human from Blur: Human Pose Tracking from Blurry Images
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
We propose a method to estimate 3D human poses from substantially blurred images. The key idea is to tackle the inverse problem of image deblurring by modeling the forward problem with a 3D human model, a texture map, and a sequence of poses to describe human motion. The blurring process is then modeled by a temporal image aggregation step. Using a differentiable renderer, we can solve the inverse problem by backpropagating the pixel-wise ...
Conference Paper
RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Although point cloud registration has achieved remarkable advances in object-level and indoor scenes, large-scale registration methods are rarely explored. Challenges mainly arise from the huge point number, complex distribution, and outliers of outdoor LiDAR scans. In addition, most existing registration works generally adopt a two-stage paradigm: They first find correspondences by extracting discriminative local features and then leverage ...
Conference Paper
RLSAC: Reinforcement Learning enhanced Sample Consensus for End-to-End Robust Estimation
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Robust estimation is a crucial and still challenging task, which involves estimating model parameters in noisy environments. Although conventional sampling consensus-based algorithms sample several times to achieve robustness, these algorithms cannot use data features and historical information effectively. In this paper, we propose RLSAC, a novel Reinforcement Learning enhanced SAmple Consensus framework for end-to-end robust estimation. ...
Conference Paper
Tracking by 3D Model Estimation of Unknown Objects in Videos
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Most model-free visual object tracking methods formulate the tracking task as object location estimation given by a 2D segmentation or a bounding box in each video frame. We argue that this representation is limited and instead propose to guide and improve 2D tracking with an explicit object representation, namely the textured 3D shape and 6DoF pose in each video frame. Our representation tackles a complex long-term dense correspondence ...
Conference Paper
GlueStick: Robust Image Matching by Sticking Points and Lines Together
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Line segments are powerful features complementary to points. They offer structural cues, robust to drastic viewpoint and illumination changes, and can be present even in texture-less areas. However, describing and matching them is more challenging compared to points due to partial occlusions, lack of texture, or repetitiveness. This paper introduces a new matching paradigm, where points, lines, and their descriptors are unified into a ...
Conference Paper
R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Dense 3D reconstruction and ego-motion estimation are key challenges in autonomous driving and robotics. Compared to the complex, multi-modal systems deployed today, multi-camera systems provide a simpler, low-cost alternative. However, camera-based 3D reconstruction of complex dynamic scenes has proven extremely difficult, as existing solutions often produce incomplete or incoherent results. We propose R3D3, a multi-camera system for ...
Conference Paper
IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Existing inverse rendering combined with neural rendering methods can only perform editable novel view synthesis on object-specific scenes, while we present intrinsic neural radiance fields, dubbed IntrinsicNeRF, which introduce intrinsic decomposition into the NeRF-based neural rendering method and can extend its application to room-scale scenes. Since intrinsic decomposition is a fundamentally under-constrained inverse problem, we propose ...
Conference Paper
LabelMaker3D: Automatic Semantic Label Generation from RGB-D Trajectories
(2024)
Semantic annotations are indispensable to train or evaluate perception models, yet very costly to acquire. This work introduces a fully automated 2D/3D labeling framework that, without any human intervention, can generate labels for RGB-D scans at equal (or better) level of accuracy than comparable manually annotated datasets such as ScanNet. Our approach is based on an ensemble of state-of-the-art segmentation models and 3D lifting through ...
Conference Paper
OpenMask3D: Open-Vocabulary 3D Instance Segmentation
(2023)
We introduce the task of open-vocabulary 3D instance segmentation. Current approaches for 3D instance segmentation can typically only recognize object categories from a pre-defined closed set of classes that are annotated in the training datasets. This results in important limitations for real-world applications where one might need to perform tasks guided by novel, open-vocabulary queries related to a wide variety of objects. Recently, ...
Conference Paper