Qin Wang
18 results
Search Results
Publications 1 - 10 of 18
- CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution
Item type: Conference Paper
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Cao, Jiezhang; Wang, Qin; Xian, Yongqin; et al. (2023)
Learning continuous image representations is recently gaining popularity for image super-resolution (SR) because of its ability to reconstruct high-resolution images with arbitrary scales from low-resolution inputs. Existing methods mostly ensemble nearby features to predict the new pixel at any queried coordinate in the SR image. Such a local ensemble suffers from some limitations: i) it has no learnable parameters and it neglects the similarity of the visual features; ii) it has a limited receptive field and cannot ensemble relevant features in a large field which are important in an image. To address these issues, this paper proposes a continuous implicit attention-in-attention network, called CiaoSR. We explicitly design an implicit attention network to learn the ensemble weights for the nearby local features. Furthermore, we embed a scale-aware attention in this implicit attention network to exploit additional non-local information. Extensive experiments on benchmark datasets demonstrate CiaoSR significantly outperforms the existing single image SR methods with the same backbone. In addition, CiaoSR also achieves the state-of-the-art performance on the arbitrary-scale SR task. The effectiveness of the method is also demonstrated on the real-world SR setting. More importantly, CiaoSR can be flexibly integrated into any backbone to improve the SR performance.
- DARE-GRAM : Unsupervised Domain Adaptation Regression by Aligning Inverse Gram Matrices
Item type: Conference Paper
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Nejjar, Ismail; Wang, Qin; Fink, Olga (2023)
Unsupervised Domain Adaptation Regression (DAR) aims to bridge the domain gap between a labeled source dataset and an unlabelled target dataset for regression problems. Recent works mostly focus on learning a deep feature encoder by minimizing the discrepancy between source and target features. In this work, we present a different perspective for the DAR problem by analyzing the closed-form ordinary least square (OLS) solution to the linear regressor in the deep domain adaptation context. Rather than aligning the original feature embedding space, we propose to align the inverse Gram matrix of the features, which is motivated by its presence in the OLS solution and the Gram matrix's ability to capture the feature correlations. Specifically, we propose a simple yet effective DAR method which leverages the pseudo-inverse low-rank property to align the scale and angle in a selected sub-space generated by the pseudo-inverse Gram matrix of the two domains. We evaluate our method on three domain adaptation regression benchmarks. Experimental results demonstrate that our method achieves state-of-the-art performance. Our code is available at https://github.com/ismailnejjar/DARE-GRAM.
- Towards Interpretable Video Super-Resolution via Alternating Optimization
Item type: Conference Paper
Lecture Notes in Computer Science ~ Computer Vision – ECCV 2022
Cao, Jiezhang; Liang, Jingyun; Zhang, Kai; et al. (2022)
In this paper, we study a practical space-time video super-resolution (STVSR) problem which aims at generating a high-framerate high-resolution sharp video from a low-framerate low-resolution blurry video. Such problem often occurs when recording a fast dynamic event with a low-framerate and low-resolution camera, and the captured video would suffer from three typical issues: i) motion blur occurs due to object/camera motions during exposure time; ii) motion aliasing is unavoidable when the event temporal frequency exceeds the Nyquist limit of temporal sampling; iii) high-frequency details are lost because of the low spatial sampling rate. These issues can be alleviated by a cascade of three separate sub-tasks, including video deblurring, frame interpolation, and super-resolution, which, however, would fail to capture the spatial and temporal correlations among video sequences. To address this, we propose an interpretable STVSR framework by leveraging both model-based and learning-based methods. Specifically, we formulate STVSR as a joint video deblurring, frame interpolation, and super-resolution problem, and solve it as two sub-problems in an alternate way. For the first sub-problem, we derive an interpretable analytical solution and use it as a Fourier data transform layer. Then, we propose a recurrent video enhancement layer for the second sub-problem to further recover high-frequency details. Extensive experiments demonstrate the superiority of our method in terms of quantitative metrics and visual quality.
- Towards Real-World Domain Adaptation with Deep Learning
Item type: Doctoral Thesis
Wang, Qin (2021)
- Potential, challenges and future directions for deep learning in prognostics and health management applications
Item type: Journal Article
Engineering Applications of Artificial Intelligence
Fink, Olga; Wang, Qin; Svensén, Markus; et al. (2020)
Deep learning applications have been thriving over the last decade in many different domains, including computer vision and natural language understanding. The drivers for the vibrant development of deep learning have been the availability of abundant data, breakthroughs of algorithms and the advancements in hardware. Despite the fact that complex industrial assets have been extensively monitored and large amounts of condition monitoring signals have been collected, the application of deep learning approaches for detecting, diagnosing and predicting faults of complex industrial assets has been limited. The current paper provides a thorough evaluation of the current developments, drivers, challenges, potential solutions and future research needs in the field of deep learning applied to Prognostics and Health Management (PHM) applications.
- Missing-Class-Robust Domain Adaptation by Unilateral Alignment
Item type: Journal Article
IEEE Transactions on Industrial Electronics
Wang, Qin; Michau, Gabriel; Fink, Olga (2020)
- Continuous Pseudo-Label Rectified Domain Adaptive Semantic Segmentation with Implicit Neural Representations
Item type: Conference Paper
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Gong, Rui; Wang, Qin; Danelljan, Martin; et al. (2023)
Unsupervised domain adaptation (UDA) for semantic segmentation aims at improving the model performance on the unlabeled target domain by leveraging a labeled source domain. Existing approaches have achieved impressive progress by utilizing pseudo-labels on the unlabeled target-domain images. Yet the low-quality pseudo-labels, arising from the domain discrepancy, inevitably hinder the adaptation. This calls for effective and accurate approaches to estimating the reliability of the pseudo-labels, in order to rectify them. In this paper, we propose to estimate the rectification values of the predicted pseudo-labels with implicit neural representations. We view the rectification value as a signal defined over the continuous spatial domain. Taking an image coordinate and the nearby deep features as inputs, the rectification value at a given coordinate is predicted as an output. This allows us to achieve high-resolution and detailed rectification values estimation, important for accurate pseudo-label generation at mask boundaries in particular. The rectified pseudo-labels are then leveraged in our rectification-aware mixture model (RMM) to be learned end-to-end and help the adaptation. We demonstrate the effectiveness of our approach on different UDA benchmarks, including synthetic-to-real and day-to-night. Our approach achieves superior results compared to state-of-the-art. The implementation is available at https://github.com/ETHRuiGong/IR2F.
- Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search
Item type: Conference Paper
Lecture Notes in Computer Science ~ Computer Vision – ECCV 2020
Tian, Yuan; Wang, Qin; Huang, Zhiwu; et al. (2020)
In this paper, we introduce a new reinforcement learning (RL) based neural architecture search (NAS) methodology for effective and efficient generative adversarial network (GAN) architecture search. The key idea is to formulate the GAN architecture search problem as a Markov decision process (MDP) for smoother architecture sampling, which enables a more effective RL-based search algorithm by targeting the potential global optimal architecture. To improve efficiency, we exploit an off-policy GAN architecture search algorithm that makes efficient use of the samples generated by previous policies. Evaluation on two standard benchmark datasets (i.e., CIFAR-10 and STL-10) demonstrates that the proposed method is able to discover highly competitive architectures for generally better image generation results with a considerably reduced computational burden: 7 GPU hours. Our code is available at https://github.com/Yuantian013/E2GAN.
- Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation
Item type: Conference Paper
2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Wang, Qin; Dai, Dengxin; Hoyer, Lukas; et al. (2021)
Domain adaptation for semantic segmentation aims to improve the model performance in the presence of a distribution shift between source and target domain. Leveraging the supervision from auxiliary tasks (such as depth estimation) has the potential to heal this shift because many visual tasks are closely related to each other. However, such a supervision is not always available. In this work, we leverage the guidance from self-supervised depth estimation, which is available on both domains, to bridge the domain gap. On the one hand, we propose to explicitly learn the task feature correlation to strengthen the target semantic predictions with the help of target depth estimation. On the other hand, we use the depth prediction discrepancy from source and target depth decoders to approximate the pixel-wise adaptation difficulty. The adaptation difficulty, inferred from depth, is then used to refine the target semantic segmentation pseudo-labels. The proposed method can be easily implemented into existing segmentation frameworks. We demonstrate the effectiveness of our approach on the benchmark tasks SYNTHIA-to-Cityscapes and GTA-to-Cityscapes, on which we achieve the new state-of-the-art performance of 55.0% and 56.6%, respectively. Our code is available at https://qin.ee/corda.
- ContextVP: Fully Context-Aware Video Prediction
Item type: Conference Paper
Lecture Notes in Computer Science ~ Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XVI
Byeon, Wonmin; Wang, Qin; Kumar Srivastava, Rupesh; et al. (2018)