Journal: IEEE Transactions on Image Processing
Abbreviation
IEEE Trans. Image Process.
Publisher
IEEE
49 results
Search Results
Publications 1 - 10 of 49
- Image-dependent gamut mapping as optimization problem
Item type: Journal Article
IEEE Transactions on Image Processing
Giesen, Joachim; Schuberth, Eva; Simon, Klaus; et al. (2007)
- DeepQTMT: A Deep Learning Approach for Fast QTMT-Based CU Partition of Intra-Mode VVC
Item type: Journal Article
IEEE Transactions on Image Processing
Li, Tianyi; Xu, Mai; Tang, Runzhi; et al. (2021)
Versatile Video Coding (VVC), as the latest standard, significantly improves the coding efficiency over its predecessor standard High Efficiency Video Coding (HEVC), but at the expense of sharply increased complexity. In VVC, the quad-tree plus multi-type tree (QTMT) structure of the coding unit (CU) partition accounts for over 97% of the encoding time, due to the brute-force search for recursive rate-distortion (RD) optimization. Instead of the brute-force QTMT search, this paper proposes a deep learning approach to predict the QTMT-based CU partition, for drastically accelerating the encoding process of intra-mode VVC. First, we establish a large-scale database containing sufficient CU partition patterns with diverse video content, which can facilitate the data-driven VVC complexity reduction. Next, we propose a multi-stage exit CNN (MSE-CNN) model with an early-exit mechanism to determine the CU partition, in accord with the flexible QTMT structure at multiple stages. Then, we design an adaptive loss function for training the MSE-CNN model, synthesizing both the uncertain number of split modes and the target on minimized RD cost. Finally, a multi-threshold decision scheme is developed, achieving a desirable trade-off between complexity and RD performance. The experimental results demonstrate that our approach can reduce the encoding time of VVC by 44.65%~66.88% with a negligible Bjøntegaard delta bit-rate (BD-BR) of 1.322%~3.188%, significantly outperforming other state-of-the-art approaches. © 2021 IEEE
- Distinguishing Texture Edges from Object Boundaries in Video
Item type: Journal Article
IEEE Transactions on Image Processing
Wang, O.; Dumcke, M.; Smolic, A.; et al. (2013)
- Layout-to-Image Translation With Double Pooling Generative Adversarial Networks
Item type: Journal Article
IEEE Transactions on Image Processing
Tang, Hao; Sebe, Nicu (2021)
In this paper, we address the task of layout-to-image translation, which aims to translate an input semantic layout to a realistic image. One open challenge widely observed in existing methods is the lack of effective semantic constraints during the image translation process, leading to models that cannot preserve the semantic information and ignore the semantic dependencies within the same object. To address this issue, we propose a novel Double Pooling GAN (DPGAN) for generating photo-realistic and semantically-consistent results from the input layout. We also propose a novel Double Pooling Module (DPM), which consists of the Square-shape Pooling Module (SPM) and the Rectangle-shape Pooling Module (RPM). Specifically, SPM aims to capture short-range semantic dependencies of the input layout with different spatial scales, while RPM aims to capture long-range semantic dependencies from both horizontal and vertical directions. We then effectively fuse both outputs of SPM and RPM to further enlarge the receptive field of our generator. Extensive experiments on five popular datasets show that the proposed DPGAN achieves better results than state-of-the-art methods. Finally, both SPM and RPM are general and can be seamlessly integrated into any GAN-based architectures to strengthen the feature representation. The code is available at https://github.com/Ha0Tang/DPGAN.
- A Bayesian Framework for the Analog Reconstruction of Kymographs from Fluorescence Microscopy Data
Item type: Journal Article
IEEE Transactions on Image Processing
Samuylov, Denis K.; Székely, Gábor; Paul, Grégory (2019)
- Group-Wise Learning for Weakly Supervised Semantic Segmentation
Item type: Journal Article
IEEE Transactions on Image Processing
Zhou, Tianfei; Li, Liulei; Li, Xueyi; et al. (2022)
Acquiring sufficient ground-truth supervision to train deep visual models has been a bottleneck over the years due to the data-hungry nature of deep learning. This is exacerbated in some structured prediction tasks, such as semantic segmentation, which require pixel-level annotations. This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation. To achieve this, we propose, for the first time, a novel group-wise learning framework for WSSS. The framework explicitly encodes semantic dependencies in a group of images to discover rich semantic context for estimating more reliable pseudo ground-truths, which are subsequently employed to train more effective segmentation models. In particular, we solve the group-wise learning within a graph neural network (GNN), wherein input images are represented as graph nodes, and the underlying relations between a pair of images are characterized by graph edges. We then formulate semantic mining as an iterative reasoning process which propagates the common semantics shared by a group of images to enrich node representations. Moreover, in order to prevent the model from paying excessive attention to common semantics, we further propose a graph dropout layer to encourage the graph model to capture more accurate and complete object responses. With the above efforts, our model lays the foundation for more sophisticated and flexible group-wise semantic mining. We conduct comprehensive experiments on the popular PASCAL VOC 2012 and COCO benchmarks, and our model yields state-of-the-art performance. In addition, our model shows promising performance in weakly supervised object localization (WSOL) on the CUB-200-2011 dataset, demonstrating strong generalizability. Our code is available at: https://github.com/Lixy1997/Group-WSSS.
- Automatic View Synthesis by Image-Domain-Warping
Item type: Journal Article
IEEE Transactions on Image Processing
Stefanoski, Nikolce; Wang, Oliver; Lang, Manuel; et al. (2013)
- Unsupervised High-Resolution Portrait Gaze Correction and Animation
Item type: Journal Article
IEEE Transactions on Image Processing
Zhang, Jichao; Chen, Jingjing; Tang, Hao; et al. (2022)
This paper proposes a gaze correction and animation method for high-resolution, unconstrained portrait images, which can be trained without the gaze angle and the head pose annotations. Common gaze-correction methods usually require annotating training data with precise gaze and head pose information. Solving this problem using an unsupervised method remains an open problem, especially for high-resolution face images in the wild, which are not easy to annotate with gaze and head pose labels. To address this issue, we first create two new portrait datasets: CelebGaze (256 x 256) and high-resolution CelebHQGaze (512 x 512). Second, we formulate the gaze correction task as an image inpainting problem, addressed using a Gaze Correction Module (GCM) and a Gaze Animation Module (GAM). Moreover, we propose an unsupervised training strategy, i.e., Synthesis-As-Training, to learn the correlation between the eye region features and the gaze angle. As a result, we can use the learned latent space for gaze animation with semantic interpolation in this space. Moreover, to alleviate both the memory and the computational costs in the training and the inference stage, we propose a Coarse-to-Fine Module (CFM) integrated with GCM and GAM. Extensive experiments validate the effectiveness of our method for both the gaze correction and the gaze animation tasks in both low and high-resolution face datasets in the wild and demonstrate the superiority of our method with respect to the state of the art.
- Constant Time Joint Bilateral Filtering Using Joint Integral Histograms
Item type: Journal Article
IEEE Transactions on Image Processing
Zhang, K.; Lafruit, G.; Lauwereins, R.; et al. (2012)
- Real-Time Action Recognition with Deeply-Transferred Motion Vector CNNs
Item type: Journal Article
IEEE Transactions on Image Processing
Zhang, Bowen; Wang, Limin; Wang, Zhe; et al. (2018)