SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion
METADATA ONLY
Date
2024
Publication Type
Conference Paper
ETH Bibliography
yes
Abstract
A long-standing goal of 3D human reconstruction is to create lifelike and fully detailed 3D humans from single-view images. The main challenge lies in inferring unknown body shapes, appearances, and clothing details in areas not visible in the images. To address this, we propose SiTH, a novel pipeline that uniquely integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow. At the core of our method lies the decomposition of the challenging single-view reconstruction problem into generative hallucination and reconstruction subproblems. For the former, we employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images. For the latter, we leverage skinned body meshes as guidance to recover full-body texture meshes from the input and back-view images. SiTH requires as few as 500 3D human scans for training while maintaining its generality and robustness to diverse images. Extensive evaluations on two 3D human benchmarks, including our newly created one, highlighted our method's superior accuracy and perceptual quality in 3D textured human reconstruction.
Publication status
published
Book title
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Pages / Article No.
538 - 549
Publisher
IEEE
Event
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)
Subject
Digital Humans; 3D Reconstruction; Texture Generation; Generative Modeling; Diffusion Models; Benchmark; Single-view Reconstruction; 3D Content Creation
Organisational unit
03979 - Hilliges, Otmar (former)