“Revolutionize Video Face Swapping with Robust, Diffusion-Powered, Temporal Consistency-Driven Innovation!”
This repository contains code for the paper VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping.
We propose a diffusion-based framework for video face swapping, featuring hybrid training, an AIDT dataset, and 3D reconstruction for superior identity preservation and temporal consistency.
🌐 Project Page | 🤗 Hugging Face Models
- 2025-10-15: 🔓 Code and pre-trained weights released!
- 2025-09-19: 🎉 Our paper VividFace has been accepted to NeurIPS 2025!
- Create and activate the Conda environment:

  ```bash
  conda create --name vividface python=3.8
  conda activate vividface
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Install the dependency for Deep3DFaceRecon_pytorch:

  ```bash
  cd Deep3DFaceRecon/nvdiffrast
  pip install .
  ```
Download the required models and place them in the correct directories:
| Model | Source | Destination |
|---|---|---|
| VividFace Weights | Hugging Face | weights/ |
| BFM Model | Included in VividFace weights | Deep3DFaceRecon/BFM/ |
| Stable Diffusion v1.5 | Hugging Face | weights/stable-diffusion-v1-5/ |
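As a convenience, the checkpoints can also be fetched programmatically with `huggingface_hub`. This is only a sketch: the repository IDs below are placeholders for the repositories linked above, and the BFM model still needs to be copied into Deep3DFaceRecon/BFM/ afterwards.

```python
# Sketch: fetch the released checkpoints with huggingface_hub.
# The repo IDs are placeholders -- substitute the actual repositories
# from the Hugging Face links above.
from huggingface_hub import snapshot_download

VIVIDFACE_REPO = "..."  # placeholder: VividFace weights repo (includes the BFM model)
SD15_REPO = "..."       # placeholder: Stable Diffusion v1.5 repo

# VividFace weights -> weights/
snapshot_download(repo_id=VIVIDFACE_REPO, local_dir="weights")

# Stable Diffusion v1.5 -> weights/stable-diffusion-v1-5/
snapshot_download(repo_id=SD15_REPO, local_dir="weights/stable-diffusion-v1-5")
```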
We have prepared some sample files in the examples/ folder for testing. Use the following command to run a test:
```bash
python infer.py examples
```

This will sequentially replace faces in the videos located in examples/videos/ with faces from examples/faces/ (e.g., the first video in examples/videos/ will have its face replaced by the first face in examples/faces/).
After execution, you can find the output results in the outputs/ directory.
If you want to test your own data, follow the format of the examples/ folder:
- Each video must have a corresponding `.txt` file with the same name.
- The `.txt` file should have the same number of lines as the number of frames in the video.
- Each line must contain 14 values:
  - The first 4 values represent the face bounding box (bbox).
  - The next 10 values represent 5 facial keypoints.
- Ensure that faces are cropped properly. We recommend using insightface for face cropping (see the annotation sketch below).
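Below is a minimal sketch of how such an annotation file could be generated with insightface. It assumes whitespace-separated values in the order x1 y1 x2 y2 followed by the five keypoints in InsightFace's (x, y) order, and keeps the largest detected face per frame; the video path is hypothetical, and the exact value convention should be checked against the sample files in examples/.

```python
# Sketch: generate a per-frame annotation file (4 bbox values + 10 keypoint
# values per line) for a video, using insightface for detection.
# Assumptions: whitespace-separated x1 y1 x2 y2 kx1 ky1 ... kx5 ky5 per line,
# largest face per frame; verify against the sample .txt files in examples/.
import cv2
from insightface.app import FaceAnalysis

app = FaceAnalysis(name="buffalo_l")
app.prepare(ctx_id=0, det_size=(640, 640))

video_path = "examples/videos/my_video.mp4"  # hypothetical path
cap = cv2.VideoCapture(video_path)

lines = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    faces = app.get(frame)
    if not faces:
        raise RuntimeError("No face detected in a frame; crop or trim the video first.")
    # Keep the largest detected face in this frame.
    face = max(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))
    values = list(face.bbox) + list(face.kps.flatten())  # 4 + 10 = 14 values
    lines.append(" ".join(f"{v:.2f}" for v in values))
cap.release()

# Write the annotation next to the video, with the same base name.
with open(video_path.rsplit(".", 1)[0] + ".txt", "w") as f:
    f.write("\n".join(lines) + "\n")
```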
We provide training scripts, but the dataset is not yet available for public use. Therefore, training cannot be executed at this time. However, you can explore the code by running:
```bash
bash run.sh
```

To train our Face3DVAE, navigate to the face3dvae directory and run the training script:
```bash
cd face3dvae
bash train.sh
```

If you find our work helpful, please cite:
```bibtex
@article{shao2024vividface,
  title={VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping},
  author={Shao, Hao and Wang, Shulun and Zhou, Yang and Song, Guanglu and He, Dailan and Qin, Shuo and Zong, Zhuofan and Ma, Bingqi and Liu, Yu and Li, Hongsheng},
  journal={arXiv preprint arXiv:2412.11279},
  year={2024}
}
```
This project is released for academic use. We disclaim responsibility for user-generated content.
Our work builds upon the following excellent projects: Deep3DFaceRecon_pytorch, InsightFace, and AnimateDiff.
All code within this repository is under Apache License 2.0.
