Skip to yearly menu bar Skip to main content


Oral

Diffusion-FOF: Single-View Clothed Human Reconstruction via Diffusion-Based Fourier Occupancy Field

Yuanzhen Li · Fei LUO · Chunxia Xiao

Summit Ballroom Oral #4
[ ] [ Visit Orals 3A 3D from single view ]
Thu 20 Jun 9:54 a.m. — 10:12 a.m. PDT

Abstract:

Fourier occupancy field-based human reconstruction is a simple method that transforms the occupancy function of the 3D model into a multichannel 2D vector field. However, accurately estimating high-frequency information of the FOF is challenging, leading to geometric distortion and discontinuity. To this end, we propose a wavelet-based diffusion model to predict the FOF, extracting more high-frequency information and enhancing geometric stability. Our method comprises two interconnected tasks: texture estimation and geometry prediction. Initially, we predict the back-side texture from the input image, incorporating a style consistency constraint between the predicted back-side image and the original input image. To enhance network training effectiveness, we adopt a Siamese network training strategy. We introduce a wavelet-based diffusion model for geometric estimation to generate the Fourier occupancy field. First, we utilize an image encoder module to extract the features of the two images as conditions. Subsequently, we employ a conditional diffusion model to estimate the Fourier occupancy field in the wavelet domain. The predicted wavelet coefficients are then converted into the Fourier occupancy field using the inverse wavelet transform (IWT). A refinement network refines the predicted Fourier occupancy field with image features as guidance, yielding the final output. Through both quantitative and qualitative experiments, we demonstrate the state-of-the-art performance of our method in reconstructing single-view clothed human subjects.

Chat is not available.