Now showing items 11-20 of 648

EDAPS: Enhanced Domain-Adaptive Panoptic Segmentation

Saha, Suman; Hoyer, Lukas; Obukhov, Anton; et al. (2024)

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

With autonomous industries on the rise, domain adaptation of the visual perception stack is an important research direction due to the cost savings promise. Much prior art was dedicated to domain-adaptive semantic segmentation in the synthetic-to-real context. Despite being a crucial output of the perception stack, panoptic segmentation has been largely overlooked by the domain adaptation community. Therefore, we revisit well-performing ...

Conference Paper

MultiVT: Multiple-Task Framework for Dentistry

Mello Rella, Edoardo; Chhatkuli, Ajad; Konukoglu, Ender; et al. (2024)

Lecture Notes in Computer Science ~ Domain Adaptation and Representation Transfer

Current image understanding methods on dental data are often trained end-to-end on inputs and labels, with focus on using state-of-the-art neural architectures. Such approaches, however, typically ignore domain specific peculiarities and lack the ability to generalize outside their training dataset. We observe that, in RGB images, teeth display a weak or unremarkable texture while exhibiting strong boundaries; similarly, in panoramic ...

Conference Paper

Replay-Based Online Adaptation for Unsupervised Deep Visual Odometry

Kuznietsov, Yevhen; Proesmans, Marc; Van Gool, Luc (2024)

Lecture Notes in Computer Science ~ Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

Online adaptation is a promising paradigm that enables dynamic adaptation to new environments. In recent years, there has been a growing interest in exploring online adaptation for various problems, including visual odometry, a crucial task in robotics, autonomous systems, and driver assistance applications. In this work, we leverage experience replay, a potent technique for enhancing online adaptation, to explore the replay-based online ...

Conference Paper

LocalViT: Analyzing Locality in Vision Transformers

Li, Yawei; Zhang, Kai; Cao, Jiezhang; et al. (2023)

2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

The aim of this paper is to study the influence of locality mechanisms in vision transformers. Transformers originated from machine translation and are particularly good at modelling long-range dependencies within a long sequence. Although the global interaction between the token embeddings could be well modelled by the self-attention mechanism of transformers, what is lacking is a locality mechanism for information exchange within a local ...

Conference Paper

Model-aware 3D Eye Gaze from Weak and Few-shot Supervisions

Popovic, Nikola; Christodoulou, Dimitrios; Paudel, Danda Pani; et al. (2023)

2023 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct)

The task of predicting 3D eye gaze from eye images can be performed either by (a) end-to-end learning for image-to-gaze mapping or by (b) fitting a 3D eye model onto images. The former case requires 3D gaze labels, while the latter requires eye semantics or landmarks to facilitate the model fitting. Although obtaining eye semantics and landmarks is relatively easy, fitting an accurate 3D eye model on them remains very challenging ...

Conference Paper

Optimizing Long-Term Player Tracking and Identification in NAO Robot Soccer by fusing Game-state and External Video

Albanese, Giuliano; Mitra, Arka; Zaech, Jan-Nico; et al. (2023)

RoboLetics: Workshop on Robot Learning in Athletics @CoRL 2023

Monitoring a fleet of robots requires stable long-term tracking with re-identification, which is still an unsolved challenge in many scenarios. One application of this is the analysis of autonomous robotic soccer games at RoboCup. Tracking in these games requires handling of identical-looking players, strong occlusions, and non-professional video recordings, but also offers state information estimated by the robots. In order to make ...

Conference Paper

Multi-Domain Referee Dataset: Enabling Recognition of Referee Signals on Robotic Platforms

Mitra, Arka; Molnar, Lukas; Zaech, Jan-Nico; et al. (2023)

RoboLetics: Workshop on Robot Learning in Athletics @CoRL 2023

Recognizing referee signals is crucial in human and RoboCup soccer games, where an emphasis currently lies on full robot autonomy through understanding referee signals. To advance towards this goal, we introduce the Multi-Domain Referee Dataset aimed at high-efficiency action recognition in RoboCup and examine the transfer between simulated and real domains in strongly structured settings. Our dataset includes 3,108 action sequences across ...

Conference Paper

Token-consistent Dropout for Calibrated Vision Transformers

Popovic, Nikola; Paudel, Danda Pani; Probst, Thomas; et al. (2023)

2023 IEEE International Conference on Image Processing (ICIP)

We introduce token-consistent dropout in vision transformers, which improves network calibration without causing any severe drop in performance. We use linear layers with token-consistent stochastic parameters inside the multilayer perceptron blocks, without altering the architecture of the transformer. The stochastic parameters are sampled from the uniform distribution, both during training and inference. The applied linear operations ...

Conference Paper

VA-DepthNet: A Variational Approach to Single Image Depth Prediction

Liu, Ce; Kumar, Suryansh; Gu, Shuhang; et al. (2023)

We introduce VA-DepthNet, a simple, effective, and accurate deep neural network approach for the single-image depth prediction (SIDP) problem. The proposed approach advocates using classical first-order variational constraints for this problem. While state-of-the-art deep neural network methods for SIDP learn the scene depth from images in a supervised setting, they often overlook the invaluable invariances and priors in the rigid scene ...

Conference Paper

CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution

Cao, Jiezhang; Wang, Qin; Xian, Yongqin; et al. (2023)

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Learning continuous image representations has recently been gaining popularity for image super-resolution (SR) because of its ability to reconstruct high-resolution images with arbitrary scales from low-resolution inputs. Existing methods mostly ensemble nearby features to predict the new pixel at any queried coordinate in the SR image. Such a local ensemble suffers from some limitations: i) it has no learnable parameters and it neglects the ...

Conference Paper
