Now showing items 1-10 of 162

Facial Emotion Recognition with Noisy Multi-task Annotations

Zhang, Siwei; Huang, Zhiwu; Paudel, Danda Pani; et al. (2021)

2021 IEEE Winter Conference on Applications of Computer Vision (WACV)

Human emotions can be inferred from facial expressions. However, the annotations of facial expressions are often highly noisy in common emotion coding models, including categorical and dimensional ones. To reduce human labelling effort on multi-task labels, we introduce a new problem of facial emotion recognition with noisy multi-task annotations. For this new problem, we suggest a formulation from the point of joint distribution match ...

Conference Paper

SRFlow: Learning the Super-Resolution Space with Normalizing Flow

Lugmayr, Andreas; Danelljan, Martin; Van Gool, Luc; et al. (2020)

Lecture Notes in Computer Science: Computer Vision – ECCV 2020, 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part V

Conference Paper

GANmut: Learning Interpretable Conditional Space for Gamut of Emotions

d'Apolito, Stefano; Paudel, Danda Pani; Huang, Zhiwu; et al. (2021)

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Humans can communicate emotions through a plethora of facial expressions, each with its own intensity, nuances and ambiguities. The generation of such variety by means of conditional GANs is limited to the expressions encoded in the label system used. These limitations are caused either by burdensome labelling demands or by a confounded label space. On the other hand, learning from inexpensive and intuitive basic categorical emotion ...

Conference Paper

3D CNNs with Adaptive Temporal Feature Resolutions

Fayyaz, Mohsen; Bahrami, Emad; Diba, Ali; et al. (2021)

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

While state-of-the-art 3D Convolutional Neural Networks (CNN) achieve very good results on action recognition datasets, they are computationally very expensive and require many GFLOPs. While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips. In this work, we therefore introduce a differentiable Similarity Guided Sampling (SGS) ...

Conference Paper

The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures

Li, Yawei; Li, Wen; Danelljan, Martin; et al. (2021)

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

In this paper, we tackle the problem of convolutional neural network design. Instead of focusing on the design of the overall architecture, we investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks. We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance. Based on that, we articulate the "heterogeneity hypothesis": ...

Conference Paper

Depth Estimation from Monocular Images and Sparse Radar Data

Lin, Juan-Ting; Dai, Dengxin; Van Gool, Luc (2020)

2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

In this paper, we explore the possibility of achieving more accurate depth estimation by fusing monocular images and Radar points using a deep neural network. We give a comprehensive study of the fusion between RGB images and Radar measurements from different aspects and propose a working solution based on the observations. We find that the noise existing in Radar measurements is one of the main reasons that prevents one from ...

Conference Paper

Learning Accurate and Human-Like Driving using Semantic Maps and Attention

Hecker, Simon; Dai, Dengxin; Liniger, Alexander; et al. (2020)

2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

This paper investigates how end-to-end driving models can be improved to drive more accurately and in a more human-like manner. To tackle the first issue we exploit semantic and visual maps from HERE Technologies and augment the existing Drive360 dataset with them. The maps are used in an attention mechanism that promotes segmentation confidence masks, thus focusing the network on semantic classes in the image that are important for the current driving ...

Conference Paper

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

Zhou, Tianfei; Wang, Wenguan; Liu, Si; et al. (2021)

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

To address the challenging task of instance-aware human part parsing, a new bottom-up regime is proposed to learn category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. It is a compact, efficient and powerful framework that exploits structural information over different human granularities and eases the difficulty of person partitioning. Specifically, a dense-to-sparse projection ...

Conference Paper

Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces

Kaya, Berk; Kumar, Suryansh; Porto de Oliveira, Carlos Eduardo; et al. (2021)

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

This paper presents an uncalibrated deep neural network framework for the photometric stereo problem. For training models to solve the problem, existing neural network-based methods either require exact light directions or ground-truth surface normals of the object or both. However, in practice, it is challenging to obtain both kinds of information precisely, which restricts the broader adoption of photometric stereo algorithms for vision ...

Conference Paper

VisDrone-MOT2021: The Vision Meets Drone Multiple Object Tracking Challenge Results

Chen, Guanlin; Wang, Wenguan; He, Zhijian; et al. (2021)

2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

The Vision Meets Drone: Multiple Object Tracking (VisDrone-MOT2021) challenge, the fourth annual activity organized by the VisDrone team, focuses on benchmarking UAV MOT algorithms in realistic challenging environments. It is held in conjunction with ICCV 2021. VisDrone-MOT2021 contains 96 video sequences in total, including 56 sequences (~24K frames) for training, 7 sequences (~3K frames) for validation and 33 sequences ...

Conference Paper
