Fast and Resolution-Agnostic Denoising Diffusion for Visual Content Generation and Computer Vision
OPEN ACCESS
Author / Producer
Date
2024
Publication Type
Master Thesis
ETH Bibliography
yes
Data
Rights / License
Abstract
Denoising Diffusion Probabilistic Models (DDPMs) have recently emerged as a transformative technology due to their ability to generate natural-looking images with unprecedented quality and diversity. However, several limitations still prevent them from being adopted in applications beyond pure image generation. These models typically generate images at a fixed resolution and are not directly applicable to canvases of flexible sizes. Moreover, the iterative nature of the denoising process places a heavy burden on computational resources and inference time. In this thesis, we investigate methods that (1) enable resolution-agnostic generation with diffusion models for canvases of varying sizes, and even for surfaces in 3D, without expensive re-training and (2) speed up generation with diffusion models by enabling one- or few-step generation. We demonstrate that these techniques allow diffusion models to be better incorporated into various computer vision tasks. For large-scale data generation for domain generalization, we develop a Multi-Resolution Latent Fusion technique that overcomes the difficulty latent diffusion models have in generating semantically coherent small instances, improving downstream performance. For creative mesh painting, we develop a novel pipeline that combines one- and few-step Latent Consistency Models with multi-view content fusion, achieving view-consistent painting with good generation quality at high speed. Finally, we distill a state-of-the-art diffusion model for monocular depth estimation to enable one-step estimation, greatly reducing runtime without compromising prediction quality.
Permanent link
Publication status
published
External links
Editor
Contributors
Examiner : Schindler, Konrad
Examiner : Obukhov, Anton
Publisher
ETH Zurich
Organisational unit
03886 - Schindler, Konrad / Schindler, Konrad