Search Results

- HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World
  (2023) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
  Building an interactive AI assistant that can perceive, reason, and collaborate with humans in the real world has been a long-standing pursuit in the AI community. This work is part of a broader research effort to develop intelligent agents that can interactively guide humans through performing tasks in the physical world. As a first step in this direction, we introduce HoloAssist, a large-scale egocentric human interaction dataset, where ...
  Conference Paper
- IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis
  (2023) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
  Existing inverse rendering combined with neural rendering methods can only perform editable novel view synthesis on object-specific scenes, while we present intrinsic neural radiance fields, dubbed IntrinsicNeRF, which introduce intrinsic decomposition into the NeRF-based neural rendering method and can extend its application to room-scale scenes. Since intrinsic decomposition is a fundamentally under-constrained inverse problem, we propose ...
  Conference Paper
- Tracking by 3D Model Estimation of Unknown Objects in Videos
  (2023) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
  Most model-free visual object tracking methods formulate the tracking task as object location estimation given by a 2D segmentation or a bounding box in each video frame. We argue that this representation is limited and instead propose to guide and improve 2D tracking with an explicit object representation, namely the textured 3D shape and 6DoF pose in each video frame. Our representation tackles a complex long-term dense correspondence ...
  Conference Paper
- GlueStick: Robust Image Matching by Sticking Points and Lines Together
  (2023) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
  Line segments are powerful features complementary to points. They offer structural cues, robust to drastic viewpoint and illumination changes, and can be present even in texture-less areas. However, describing and matching them is more challenging compared to points due to partial occlusions, lack of texture, or repetitiveness. This paper introduces a new matching paradigm, where points, lines, and their descriptors are unified into a ...
  Conference Paper
- Human from Blur: Human Pose Tracking from Blurry Images
  (2023) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
  We propose a method to estimate 3D human poses from substantially blurred images. The key idea is to tackle the inverse problem of image deblurring by modeling the forward problem with a 3D human model, a texture map, and a sequence of poses to describe human motion. The blurring process is then modeled by a temporal image aggregation step. Using a differentiable renderer, we can solve the inverse problem by backpropagating the pixel-wise ...
  Conference Paper
- OpenScene: 3D Scene Understanding with Open Vocabularies
  (2023) 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  Traditional 3D scene understanding approaches rely on labeled 3D datasets to train a model for a single task with supervision. We propose OpenScene, an alternative approach where a model predicts dense features for 3D scene points that are co-embedded with text and image pixels in CLIP feature space. This zero-shot approach enables task-agnostic training and open-vocabulary queries. For example, to perform SOTA zero-shot 3D semantic ...
  Conference Paper
- Four-view Geometry with Unknown Radial Distortion
  (2023) 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  We present novel solutions to previously unsolved problems of relative pose estimation from images whose calibration parameters, namely focal lengths and radial distortion, are unknown. Our approach enables metric reconstruction without modeling these parameters. The minimal case for reconstruction requires 13 points in 4 views for both the calibrated and uncalibrated cameras. We describe and implement the first solution to these minimal ...
  Conference Paper
- Removing Objects From Neural Radiance Fields
  (2023) 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  Neural Radiance Fields (NeRFs) are emerging as a ubiquitous scene representation that allows for novel view synthesis. Increasingly, NeRFs will be shareable with other people. Before sharing a NeRF, though, it might be desirable to remove personal information or unsightly objects. Such removal is not easily achieved with the current NeRF editing frameworks. We propose a framework to remove objects from a NeRF representation created from ...
  Conference Paper
- DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients
  (2023) 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  Line segments are ubiquitous in our human-made world and are increasingly used in vision tasks. They are complementary to feature points thanks to their spatial extent and the structural information they provide. Traditional line detectors based on the image gradient are extremely fast and accurate, but lack robustness in noisy images and challenging conditions. Their learned counterparts are more repeatable and can handle challenging ...
  Conference Paper
- 3D Line Mapping Revisited
  (2023) 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  In contrast to sparse keypoints, a handful of line segments can concisely encode the high-level scene layout, as they often delineate the main structural elements. In addition to offering strong geometric cues, they are also omnipresent in urban landscapes and indoor scenes. Despite their apparent advantages, current line-based reconstruction methods are far behind their point-based counterparts. In this paper we aim to close the gap by ...
  Conference Paper