Now showing items 1-10 of 648

Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution

Zhao, Zixiang; Zhang, Jiangshe; Gu, Xiang; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Guided depth map super-resolution (GDSR), as a hot topic in multi-modal image processing, aims to upsample low-resolution (LR) depth maps with additional information involved in high-resolution (HR) RGB images from the same scene. The critical step of this task is to effectively extract domain-shared and domain-private RGB/depth features. In addition, three detailed issues, namely blurry edges, noisy surfaces, and over-transferred RGB ...

Conference Paper

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

Zhao, Zixiang; Bai, Haowen; Zhu, Yuanzhi; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Multi-modality image fusion aims to combine different modalities to produce fused images that retain the complementary features of each modality, such as functional highlights and texture details. To leverage strong generative priors and address challenges such as unstable training and lack of interpretability for GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM). ...

Conference Paper

Self-Supervised Burst Super-Resolution

Bhat, Goutam; Gharbi, Michaël; Chen, Jiawen; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

We introduce a self-supervised training strategy for burst super-resolution that only uses noisy low-resolution bursts during training. Our approach eliminates the need to carefully tune synthetic data simulation pipelines, which often do not match real-world image statistics. Compared to weakly-paired training strategies, which require noisy smartphone burst photos of static scenes, paired with a clean reference obtained from a tripod-mounted ...

Conference Paper

Dreamwalker: Mental Planning for Continuous Vision-Language Navigation

Wang, Hanqing; Liang, Wei; Van Gool, Luc; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

VLN-CE is a recently released embodied task, where AI agents need to navigate a freely traversable environment to reach a distant target location, given language instructions. It poses great challenges due to the huge space of possible strategies. Driven by the belief that the ability to anticipate the consequences of future actions is crucial for the emergence of intelligent and interpretable planning behavior, we propose DREAMWALKER - ...

Conference Paper

Introducing Language Guidance in Prompt-based Continual Learning

Khan, Muhammad Gul Zain Ali; Naeem, Muhammad Ferjad; Van Gool, Luc; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Continual learning aims to learn a single model on a sequence of tasks without access to data from previous tasks. The biggest challenge in the domain remains catastrophic forgetting: a loss in performance on seen classes of earlier tasks. Some existing methods rely on an expensive replay buffer to store a chunk of data from previous tasks. This, while promising, becomes expensive when the number of tasks becomes large or ...

Conference Paper

DiffIR: Efficient Diffusion Model for Image Restoration

Xia, Bin; Zhang, Yulun; Wang, Shiyin; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Diffusion models (DMs) have achieved SOTA performance by modeling the image synthesis process as a sequential application of a denoising network. However, unlike image synthesis, image restoration (IR) is strongly constrained to generate results in accordance with the ground truth. Thus, for IR, having traditional DMs run massive iterations on a large model to estimate whole images or feature maps is inefficient. To address this issue, ...

Conference Paper

DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models

Cai, Shengqu; Chan, Eric Ryan; Peng, Songyou; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Scene extrapolation, the idea of generating novel views by flying into a given image, is a promising yet challenging task. For each predicted frame, a joint inpainting and 3D refinement problem has to be solved, which is ill-posed and includes a high level of ambiguity. Moreover, training data for long-range scenes is difficult to obtain and usually lacks sufficient views to infer accurate camera poses. We introduce DiffDreamer, an ...

Conference Paper

Deformable Neural Radiance Fields using RGB and Event Cameras

Ma, Qi; Paudel, Danda Pani; Chhatkuli, Ajad; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Modeling Neural Radiance Fields for fast-moving deformable objects from visual data alone is a challenging problem. A major issue arises due to the high deformation and low acquisition rates. To address this problem, we propose to use event cameras that offer very fast acquisition of visual change in an asynchronous manner. In this work, we develop a novel method to model the deformable neural radiance fields using RGB and event cameras. ...

Conference Paper

Source-free Depth for Object Pop-out

Wu, Zongwei; Paudel, Danda Pani; Fan, Deng-Ping; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Depth cues are known to be useful for visual perception. However, direct measurement of depth is often impracticable. Fortunately, though, modern learning-based methods offer promising depth maps by inference in the wild. In this work, we adapt such depth inference models for object segmentation using the objects' "pop-out" prior in 3D. The "pop-out" is a simple composition prior that assumes objects reside on the background surface. Such ...

Conference Paper

Improving Online Lane Graph Extraction by Object-Lane Clustering

Can, Yigit Baran; Liniger, Alexander; Paudel, Danda Pani; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Autonomous driving requires accurate local scene understanding information. To this end, autonomous agents deploy object detection and online BEV lane graph extraction methods as a part of their perception stack. In this work, we propose an architecture and loss formulation to improve the accuracy of local lane graph estimates by using 3D object detection outputs. The proposed method learns to assign the objects to center-lines by considering ...

Conference Paper
