Using Gaussian Splats to Create High-Fidelity Facial Geometry and Texture

📅 2025-12-18
🤖 AI Summary
This work addresses the problem of reconstructing high-fidelity, renderable 3D facial models from only a few uncalibrated face images. We propose a co-optimization framework that couples Gaussian Splatting with an explicit triangle mesh. Methodologically, we use semantic segmentation to align facial regions and softly constrain the Gaussians to the mesh, which supports accurate neutral-pose reconstruction; we map the Gaussians into texture space, where they act as a view-dependent neural texture (generated at 4K resolution); and we use a relightable Gaussian model to extract an albedo texture decoupled from illumination, with training that tolerates images captured under incompatible lighting. Our contributions are threefold: (1) high-quality meshes and textures are generated from Gaussian representations without modifying the standard graphics pipeline; (2) detailed facial assets suitable for animation and relighting are produced from only 11 input images; and (3) the approach is demonstrated in a text-driven 3D face asset creation pipeline.

📝 Abstract
We leverage increasingly popular three-dimensional neural representations in order to construct a unified and consistent explanation of a collection of uncalibrated images of the human face. Our approach utilizes Gaussian Splatting, since it is more explicit and thus more amenable to constraints than NeRFs. We leverage segmentation annotations to align the semantic regions of the face, facilitating the reconstruction of a neutral pose from only 11 images (as opposed to requiring a long video). We soft constrain the Gaussians to an underlying triangulated surface in order to provide a more structured Gaussian Splat reconstruction, which in turn informs subsequent perturbations to increase the accuracy of the underlying triangulated surface. The resulting triangulated surface can then be used in a standard graphics pipeline. In addition, and perhaps most impactful, we show how accurate geometry enables the Gaussian Splats to be transformed into texture space where they can be treated as a view-dependent neural texture. This allows one to use high visual fidelity Gaussian Splatting on any asset in a scene without the need to modify any other asset or any other aspect (geometry, lighting, renderer, etc.) of the graphics pipeline. We utilize a relightable Gaussian model to disentangle texture from lighting in order to obtain a delit high-resolution albedo texture that is also readily usable in a standard graphics pipeline. The flexibility of our system allows for training with disparate images, even with incompatible lighting, facilitating robust regularization. Finally, we demonstrate the efficacy of our approach by illustrating its use in a text-driven asset creation pipeline.
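
The soft constraint that ties Gaussians to the triangulated surface can be illustrated with a small sketch. The abstract does not give the form of the constraint, so the code below assumes a simple quadratic penalty on each Gaussian center's distance to the plane of an assigned triangle. The function names, the `weight` parameter, and the fixed Gaussian-to-triangle assignment are all hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: a quadratic penalty pulling Gaussian centers toward
# the planes of their assigned triangles. Not the paper's actual loss.

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_offset(center, tri):
    """Signed distance from a Gaussian center to the plane of triangle tri."""
    a, b, c = tri
    normal = cross(sub(b, a), sub(c, a))
    length = dot(normal, normal) ** 0.5
    if length == 0.0:
        raise ValueError("degenerate triangle")
    return dot(sub(center, a), normal) / length

def soft_constraint_loss(centers, tris, weight=1.0):
    """Mean squared plane offset, scaled by weight (assumed penalty form)."""
    total = sum(plane_offset(p, t) ** 2 for p, t in zip(centers, tris))
    return weight * total / len(centers)

# Usage: two Gaussians assigned to the same triangle in the z = 0 plane.
tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
centers = [(0.2, 0.2, 0.5), (0.25, 0.25, 0.0)]
loss = soft_constraint_loss(centers, [tri, tri], weight=2.0)
```

Because the penalty is added to the loss rather than enforced exactly, the Gaussians remain free to leave the surface where the images demand it, which is one plausible reading of "soft" here.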

Problem

Research questions and friction points this paper is trying to address.

Reconstructs 3D facial geometry from few uncalibrated images
Converts Gaussian Splats into view-dependent neural textures
Enables high-fidelity facial assets in standard graphics pipelines
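
One way to picture the move from Gaussian Splats to texture space is to carry each Gaussian center into UV coordinates through the triangle it is attached to. The sketch below assumes each Gaussian has an assigned triangle with per-vertex UVs and uses standard barycentric interpolation. The paper's actual mapping, including how view dependence is handled, is not described in this summary, so every name here is hypothetical.

```python
# Hypothetical sketch: map a Gaussian center to UV space via the barycentric
# coordinates of its projection onto an assigned triangle.

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def barycentric(p, tri):
    """Barycentric weights of p's orthogonal projection onto the plane of tri."""
    a, b, c = tri
    v0, v1, v2 = sub(b, a), sub(c, a), sub(p, a)
    d00, d01, d11 = dot(v0, v0), dot(v0, v1), dot(v1, v1)
    d20, d21 = dot(v2, v0), dot(v2, v1)
    denom = d00 * d11 - d01 * d01
    if denom == 0.0:
        raise ValueError("degenerate triangle")
    wb = (d11 * d20 - d01 * d21) / denom
    wc = (d00 * d21 - d01 * d20) / denom
    return (1.0 - wb - wc, wb, wc)

def gaussian_to_uv(center, tri, tri_uvs):
    """Interpolate the triangle's vertex UVs with the barycentric weights."""
    weights = barycentric(center, tri)
    u = sum(w * uv[0] for w, uv in zip(weights, tri_uvs))
    v = sum(w * uv[1] for w, uv in zip(weights, tri_uvs))
    return (u, v)

# Usage: a Gaussian hovering 0.3 units above a triangle in the z = 0 plane.
tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
tri_uvs = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
uv = gaussian_to_uv((0.25, 0.5, 0.3), tri, tri_uvs)
```

The offset along the triangle normal is discarded by the projection, which is why accurate underlying geometry matters: the closer the Gaussians sit to the surface, the less information is lost in the mapping.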

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Splatting reconstructs facial geometry from few images
Semantic segmentation aligns face regions for neutral pose reconstruction
Relightable model extracts lighting-independent textures for standard pipelines
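
The idea of separating texture from lighting can be illustrated with a deliberately simplified model. The sketch below assumes a purely Lambertian surface, where observed color equals albedo times shading per channel, and recovers albedo by division. This is a textbook stand-in, not the relightable Gaussian model the paper uses, and the function and parameter names are hypothetical.

```python
# Hypothetical sketch: per-texel delighting under a Lambertian assumption,
# observed = albedo * shading. Not the paper's relightable Gaussian model.

def delight_texel(observed_rgb, shading_rgb, eps=1e-6):
    """Divide out shading per channel and clamp the result to [0, 1]."""
    return tuple(
        min(1.0, max(0.0, o / max(s, eps)))
        for o, s in zip(observed_rgb, shading_rgb)
    )

# Usage: a texel lit more strongly in the red channel than in green or blue.
albedo = delight_texel((0.4, 0.2, 0.1), (0.8, 0.5, 0.5))
```

The `eps` floor avoids division by zero in fully shadowed texels, where the observed color carries no usable albedo information.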