Search Results
SMILE: Semantically-guided Multi-attribute Image and Layout Editing
(2021) 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Attribute image manipulation has been a very active topic since the introduction of Generative Adversarial Networks (GANs). Exploring the disentangled attribute space within a transformation is a very challenging task due to the multiple and mutually-inclusive nature of the facial images, where different labels (eyeglasses, hats, hair, identity, etc.) can co-exist at the same time. Several works address this issue either by exploiting the ...
Conference Paper
Safe Motion Planning for Autonomous Driving using an Adversarial Road Model
(2020) Proceedings of Robotics: Science and Systems XVI
This paper presents a game-theoretic path-following formulation where the opponent is an adversary road model. This formulation allows us to compute safe sets using tools from viability theory, that can be used as terminal constraints in an optimization-based motion planner. Based on the adversary road model, we first derive an analytical discriminating domain, which even allows guaranteeing safety in the case when steering rate constraints ...
Conference Paper
Go with the Flows: Mixtures of Normalizing Flows for Point Cloud Generation and Reconstruction
(2021) 2021 International Conference on 3D Vision (3DV)
Recently Normalizing Flows (NFs) have demonstrated state-of-the-art performance on modeling 3D point clouds while allowing sampling with arbitrary resolution at inference time. However, these flow-based models still have fundamental limitations on complicated geometries. This work generalizes prior work by introducing additional discrete latent variable, i.e. mixture model. This circumvents limitations of prior approaches, leads to more ...
Conference Paper
Unsupervised Compound Domain Adaptation for Face Anti-Spoofing
(2021) 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)
We address the problem of face anti-spoofing which aims to make the face verification systems robust in the real world settings. The context of detecting live vs. spoofed face images may differ significantly in the target domain, when compared to that of labeled source domain where the model is trained. Such difference may be caused due to new and unknown spoof types, illumination conditions, scene backgrounds, among many others. These ...
Conference Paper
Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning
(2018) 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
This paper tackles the problem of video object segmentation, given some user annotation which indicates the object of interest. The problem is formulated as pixel-wise retrieval in a learned embedding space: we embed pixels of the same object instance into the vicinity of each other, using a fully convolutional network trained by a modified triplet loss as the embedding model. Then the annotated pixels are set as reference and the rest ...
Conference Paper
ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes
(2018) 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Exploiting synthetic data to learn deep models has attracted increasing attention in recent years. However, the intrinsic domain difference between synthetic and real images usually causes a significant performance drop when applying the learned model to real world scenarios. This is mainly due to two reasons: 1) the model overfits to synthetic images, making the convolutional filters incompetent to extract informative representation for ...
Conference Paper
Appearance-and-Relation Networks for Video Classification
(2018) 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Conference Paper
Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation
(2021) 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks. For this and other video understanding tasks, supervised approaches have achieved encouraging performance but require a high volume of detailed frame-level annotations. We present a fully automatic and unsupervised approach for segmenting actions in a video that does ...
Conference Paper
CoMoDA: Continuous Monocular Depth Adaptation Using Past Experiences
(2021) 2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
While ground truth depth data remains hard to obtain, self-supervised monocular depth estimation methods enjoy growing attention. Much research in this area aims at improving loss functions or network architectures. Most works, however, do not leverage self-supervision to its full potential. They stick to the standard closed world train-test pipeline, assuming the network parameters to be fixed after the training is finished. Such an ...
Conference Paper