Search Results

- Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Guided depth map super-resolution (GDSR), as a hot topic in multi-modal image processing, aims to upsample low-resolution (LR) depth maps with additional information involved in high-resolution (HR) RGB images from the same scene. The critical step of this task is to effectively extract domain-shared and domain-private RGB/depth features. In addition, three detailed issues, namely blurry edges, noisy surfaces, and over-transferred RGB ...
Conference Paper

- DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Multi-modality image fusion aims to combine different modalities to produce fused images that retain the complementary features of each modality, such as functional highlights and texture details. To leverage strong generative priors and address challenges such as unstable training and lack of interpretability for GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM). ...
Conference Paper

- Self-Supervised Burst Super-Resolution
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
We introduce a self-supervised training strategy for burst super-resolution that only uses noisy low-resolution bursts during training. Our approach eliminates the need to carefully tune synthetic data simulation pipelines, which often do not match real-world image statistics. Compared to weakly-paired training strategies, which require noisy smartphone burst photos of static scenes, paired with a clean reference obtained from a tripod-mounted ...
Conference Paper

- Dreamwalker: Mental Planning for Continuous Vision-Language Navigation
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
VLN-CE is a recently released embodied task, where AI agents need to navigate a freely traversable environment to reach a distant target location, given language instructions. It poses great challenges due to the huge space of possible strategies. Driven by the belief that the ability to anticipate the consequences of future actions is crucial for the emergence of intelligent and interpretable planning behavior, we propose DREAMWALKER - ...
Conference Paper

- Introducing Language Guidance in Prompt-based Continual Learning
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Continual Learning aims to learn a single model on a sequence of tasks without having access to data from previous tasks. The biggest challenge in the domain still remains catastrophic forgetting: a loss in performance on seen classes of earlier tasks. Some existing methods rely on an expensive replay buffer to store a chunk of data from previous tasks. This, while promising, becomes expensive when the number of tasks becomes large or ...
Conference Paper

- DiffIR: Efficient Diffusion Model for Image Restoration
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Diffusion model (DM) has achieved SOTA performance by modeling the image synthesis process into a sequential application of a denoising network. However, different from image synthesis, image restoration (IR) has a strong constraint to generate results in accordance with ground-truth. Thus, for IR, traditional DMs running massive iterations on a large model to estimate whole images or feature maps is inefficient. To address this issue, ...
Conference Paper

- DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Scene extrapolation, the idea of generating novel views by flying into a given image, is a promising yet challenging task. For each predicted frame, a joint inpainting and 3D refinement problem has to be solved, which is ill-posed and includes a high level of ambiguity. Moreover, training data for long-range scenes is difficult to obtain and usually lacks sufficient views to infer accurate camera poses. We introduce DiffDreamer, an ...
Conference Paper

- Deformable Neural Radiance Fields using RGB and Event Cameras
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Modeling Neural Radiance Fields for fast-moving deformable objects from visual data alone is a challenging problem. A major issue arises due to the high deformation and low acquisition rates. To address this problem, we propose to use event cameras that offer very fast acquisition of visual change in an asynchronous manner. In this work, we develop a novel method to model the deformable neural radiance fields using RGB and event cameras. ...
Conference Paper

- Source-free Depth for Object Pop-out
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Depth cues are known to be useful for visual perception. However, direct measurement of depth is often impracticable. Fortunately, though, modern learning-based methods offer promising depth maps by inference in the wild. In this work, we adapt such depth inference models for object segmentation using the objects' "pop-out" prior in 3D. The "pop-out" is a simple composition prior that assumes objects reside on the background surface. Such ...
Conference Paper

- Improving Online Lane Graph Extraction by Object-Lane Clustering
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Autonomous driving requires accurate local scene understanding information. To this end, autonomous agents deploy object detection and online BEV lane graph extraction methods as a part of their perception stack. In this work, we propose an architecture and loss formulation to improve the accuracy of local lane graph estimates by using 3D object detection outputs. The proposed method learns to assign the objects to center-lines by considering ...
Conference Paper