DiffusionRig: Learning Personalized Priors for Facial Appearance Editing

Abstract

We address the problem of learning person-specific facial priors from a small number (e.g., 20) of portrait photos of the same person. This enables us to edit this specific person's facial appearance, such as expression and lighting, while preserving their identity and high-frequency facial details. Key to our approach, which we dub DiffusionRig, is a diffusion model conditioned on, or "rigged by," crude 3D face models estimated from single in-the-wild images by an off-the-shelf estimator. At a high level, DiffusionRig learns to map simplistic renderings of 3D face models to realistic photos of a given person. Specifically, DiffusionRig is trained in two stages: it first learns generic facial priors from a large-scale face dataset and then person-specific priors from a small portrait photo collection of the person of interest. By learning the CGI-to-photo mapping with such personalized priors, DiffusionRig can "rig" the lighting, facial expression, head pose, etc. of a portrait photo, conditioned only on coarse 3D models while preserving this person's identity and other high-frequency characteristics. Qualitative and quantitative experiments show that DiffusionRig outperforms existing approaches in both identity preservation and photorealism.
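The two-stage schedule described in the abstract (generic priors first, then person-specific priors from a small photo collection) can be illustrated with a deliberately tiny stand-in. The sketch below is not the DiffusionRig training code: a single scalar weight replaces the diffusion model, and the datasets, learning rate, step counts, and the names `fit` and `make_data` are all assumptions chosen only to show "pretrain on a large set, then continue training from that weight on about 20 samples".

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input, target); toy stand-in for (rendering, photo)


def make_data(n: int, true_weight: float) -> List[Sample]:
    """Hypothetical dataset: n evenly spaced inputs in [0, 1] with targets w * x."""
    return [(i / (n - 1), true_weight * i / (n - 1)) for i in range(n)]


def fit(weight: float, data: List[Sample], steps: int, lr: float = 0.5) -> float:
    """Full-batch gradient descent on mean squared error for the model y = w * x."""
    for _ in range(steps):
        grad = sum(2.0 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


# Stage 1: learn a "generic prior" from a large dataset (assumed true weight 2.0).
generic_data = make_data(1000, true_weight=2.0)
w_generic = fit(0.0, generic_data, steps=50)

# Stage 2: start from the stage-1 weight and adapt to a small "personal" set
# of 20 samples (assumed true weight 2.5), mirroring the ~20 photos in the text.
personal_data = make_data(20, true_weight=2.5)
w_personal = fit(w_generic, personal_data, steps=50)
```

The point of the toy is the control flow: stage 2 does not start from scratch but continues from the stage-1 result, so the small personal set only has to account for what differs from the generic prior.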

Personalized Facial Appearance Editing

Expression Change

Lighting Change

Pose Change

Mix and Match: Physical Buffers and Global Latent Code

We mix the physical buffers from one image and the global latent code from another image to demonstrate how the two conditions encode disentangled information.
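A minimal sketch of the mixing step, assuming each image yields two conditions that can be stored separately. The `Conditions` container, the `mix_and_match` helper, and the toy values are hypothetical and only illustrate which condition comes from which image; they are not the project's actual interface.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Conditions:
    """Hypothetical container for the two conditions extracted from one image."""
    physical_buffers: Tuple[float, ...]  # stand-in for the rendered physical buffers
    global_latent: Tuple[float, ...]     # stand-in for the global latent code


def mix_and_match(buffer_source: Conditions, latent_source: Conditions) -> Conditions:
    """Combine the physical buffers of one image with the global latent of another."""
    return Conditions(
        physical_buffers=buffer_source.physical_buffers,
        global_latent=latent_source.global_latent,
    )


# Toy values standing in for conditions estimated from two different images.
image_a = Conditions(physical_buffers=(0.1, 0.2, 0.3), global_latent=(1.0, 2.0))
image_b = Conditions(physical_buffers=(0.7, 0.8, 0.9), global_latent=(3.0, 4.0))

# Buffers from image A, global latent code from image B.
mixed = mix_and_match(buffer_source=image_a, latent_source=image_b)
```

In a full pipeline the mixed conditions would be passed to the generative model; here they are only assembled, since the model call itself is outside what this page specifies.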

Swapping Personalized Models

We demonstrate the power of personalized priors by running one person’s model on other identities. This creates the effect of “adding” one person’s identity to another person.

BibTeX

@InProceedings{ding2023diffusionrig,
      author    = {Zheng Ding and Cecilia Zhang and Zhihao Xia and Lars Jebe and Zhuowen Tu and Xiuming Zhang},
      title     = {DiffusionRig: Learning Personalized Priors for Facial Appearance Editing},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      year      = {2023},
}

CVPR 2023

DiffusionRig takes in coarse physical rendering as the condition to “rig” the input image with learned personal priors. The edited images respect the rendering conditions, preserve the identity, and exhibit high-frequency facial details.
