Search Results
Viewpoint-Aware Video Summarization
(2018) 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
This paper introduces a novel variant of video summarization, namely building a summary that depends on the particular aspect of a video the viewer focuses on. We refer to this as viewpoint. To infer what the desired viewpoint may be, we assume that several other videos are available, especially groups of videos, e.g., as folders on a person's phone or laptop. The semantic similarity between videos in a group vs. the dissimilarity between ...
Conference Paper
Conditional Probability Models for Deep Image Compression
(2018) 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Deep Neural Networks trained as image auto-encoders have recently emerged as a promising direction for advancing the state-of-the-art in image compression. The key challenge in learning such networks is twofold: To deal with quantization, and to control the trade-off between reconstruction error (distortion) and entropy (rate) of the latent image representation. In this paper, we focus on the latter challenge and propose a new technique ...
Conference Paper
Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning
(2018) 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
This paper tackles the problem of video object segmentation, given some user annotation which indicates the object of interest. The problem is formulated as pixel-wise retrieval in a learned embedding space: we embed pixels of the same object instance into the vicinity of each other, using a fully convolutional network trained by a modified triplet loss as the embedding model. Then the annotated pixels are set as reference and the rest ...
Conference Paper
ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes
(2018) 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Exploiting synthetic data to learn deep models has attracted increasing attention in recent years. However, the intrinsic domain difference between synthetic and real images usually causes a significant performance drop when applying the learned model to real world scenarios. This is mainly due to two reasons: 1) the model overfits to synthetic images, making the convolutional filters incompetent to extract informative representation for ...
Conference Paper
Decomposing Image Generation into Layout Prediction and Conditional Synthesis
(2020) 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Learning the distribution of multi-object scenes with Generative Adversarial Networks (GAN) is challenging. Guiding the learning using semantic intermediate representations, which are less complex than images, can be a solution. In this article, we investigate splitting the optimisation of generative adversarial networks into two parts, by first generating a semantic segmentation mask from noise and then translating that segmentation mask ...
Conference Paper
SwinIR: Image Restoration Using Swin Transformer
(2021) 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image ...
Conference Paper
SMILE: Semantically-guided Multi-attribute Image and Layout Editing
(2021) 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Attribute image manipulation has been a very active topic since the introduction of Generative Adversarial Networks (GANs). Exploring the disentangled attribute space within a transformation is a very challenging task due to the multiple and mutually-inclusive nature of the facial images, where different labels (eyeglasses, hats, hair, identity, etc.) can co-exist at the same time. Several works address this issue either by exploiting the ...
Conference Paper
Learning from Simulation, Racing in Reality
(2021) 2021 IEEE International Conference on Robotics and Automation (ICRA)
We present a reinforcement learning-based solution to autonomously race on a miniature race car platform. We show that a policy that is trained purely in simulation using a relatively simple vehicle model, including model randomization, can be successfully transferred to the real robotic setup. We achieve this by using a novel policy output regularization approach and a lifted action space which enables smooth actions but still aggressive ...
Conference Paper
Appearance-and-Relation Networks for Video Classification
(2018) 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Conference Paper
Fast Few-Shot Classification by Few-Iteration Meta-Learning
(2021) 2021 IEEE International Conference on Robotics and Automation (ICRA)
Conference Paper