Search Results
Single Image Depth Prediction Made Better: A Multivariate Gaussian Take
(2023) 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Neural-network-based single image depth prediction (SIDP) is a challenging task where the goal is to predict the scene's per-pixel depth at test time. Since the problem, by definition, is ill-posed, the fundamental goal is to come up with an approach that can reliably model the scene depth from a set of training examples. In the pursuit of perfect depth estimation, most existing state-of-the-art learning techniques predict a single scalar ...
Conference Paper
Knowledge Distillation based Degradation Estimation for Blind Super-Resolution
(2023) The Eleventh International Conference on Learning Representations
Conference Paper
LocalViT: Analyzing Locality in Vision Transformers
(2023) 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
The aim of this paper is to study the influence of locality mechanisms in vision transformers. Transformers originated from machine translation and are particularly good at modelling long-range dependencies within a long sequence. Although the global interaction between the token embeddings could be well modelled by the self-attention mechanism of transformers, what is lacking is a locality mechanism for information exchange within a local ...
Conference Paper
Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Guided depth map super-resolution (GDSR), as a hot topic in multi-modal image processing, aims to upsample low-resolution (LR) depth maps with additional information involved in high-resolution (HR) RGB images from the same scene. The critical step of this task is to effectively extract domain-shared and domain-private RGB/depth features. In addition, three detailed issues, namely blurry edges, noisy surfaces, and over-transferred RGB ...
Conference Paper
DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Multi-modality image fusion aims to combine different modalities to produce fused images that retain the complementary features of each modality, such as functional highlights and texture details. To leverage strong generative priors and address challenges such as unstable training and lack of interpretability for GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM). ...
Conference Paper
Source-free Depth for Object Pop-out
(2024) 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Depth cues are known to be useful for visual perception. However, direct measurement of depth is often impracticable. Fortunately, though, modern learning-based methods offer promising depth maps by inference in the wild. In this work, we adapt such depth inference models for object segmentation using the objects' "pop-out" prior in 3D. The "pop-out" is a simple composition prior that assumes objects reside on the background surface. Such ...
Conference Paper
GLU-Net: Global-Local Universal Network for Dense Flow and Correspondences
(2020)
Establishing dense correspondences between a pair of images is an important and general problem, covering geometric matching, optical flow and semantic correspondences. While these applications share fundamental challenges, such as large displacements, pixel-accuracy, and appearance changes, they are currently addressed with specialized network architectures, designed for only one particular task. This severely limits the generalization ...
Conference Paper