3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement (CVPR 2025)

3DEnhancer employs a multi-view diffusion model to enhance multi-view images, thereby improving the 3D models reconstructed from them.

For more visual results, check out our project page.

Introducing 3DEnhancer

Despite advances in neural rendering, due to the scarcity of high-quality 3D datasets and the inherent limitations of multi-view diffusion models, view synthesis and 3D model generation are restricted to low resolutions with suboptimal multi-view consistency. In this study, we present a novel 3D enhancement pipeline, dubbed 3DEnhancer, which employs a multi-view latent diffusion model to enhance coarse 3D inputs while preserving multi-view consistency. Our method includes a pose-aware encoder and a diffusion-based denoiser to refine low-quality multi-view images, along with data augmentation and a multi-view attention module with epipolar aggregation to maintain consistent, high-quality 3D outputs across views. Unlike existing video-based approaches, our model supports seamless multi-view enhancement with improved coherence across diverse viewing angles. Extensive evaluations show that 3DEnhancer significantly outperforms existing methods, boosting both multi-view enhancement and per-instance 3D optimization tasks.
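
The cross-view coupling described above can be pictured as attention that runs jointly over tokens from all views. Below is a minimal PyTorch sketch of that idea; the class name, tensor shapes, and the omission of pose conditioning and epipolar aggregation are simplifications for illustration, not the repository's actual implementation:

# Joint multi-view attention sketch: tokens from every view attend to
# tokens from all views at once, which is what ties the views together.
import torch
import torch.nn as nn

class MultiViewAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (batch, views, tokens, dim) latent features, one row per view
        b, v, t, d = x.shape
        tokens = x.reshape(b, v * t, d)  # concatenate views into one sequence
        out, _ = self.attn(tokens, tokens, tokens)
        return out.reshape(b, v, t, d)

x = torch.randn(1, 4, 64, 128)  # e.g. 4 views, 64 latent tokens each
print(MultiViewAttention(128)(x).shape)  # torch.Size([1, 4, 64, 128])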

News

  • [2025/03/08] Our inference code and Gradio demo are released.
  • [2024/12/25] Our paper and project page are now live. Merry Christmas!

Installation

  1. Clone Repo

    git clone --recurse-submodules https://github.com/Luo-Yihang/3DEnhancer
    cd 3DEnhancer
  2. Create Conda Environment

    conda create -n 3denhancer python=3.10 -y
    conda activate 3denhancer
  3. Install Python Dependencies

    Important: Install Torch and xformers matching your CUDA version (a quick sanity check follows this list). For example, for Torch 2.1.0 + CUDA 11.8:

    # Install Torch and Xformers
    pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
    pip install -U xformers --index-url https://download.pytorch.org/whl/cu118

    # Install other dependencies
    pip install -r requirements.txt
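
As a quick optional sanity check, confirm that Torch sees the GPU and that xformers imports cleanly (the version in the comment matches the cu118 example above):

# Optional environment sanity check
import torch
import xformers

print(torch.__version__)  # e.g. 2.1.0+cu118
print(torch.cuda.is_available())  # should print True on a CUDA machine
print(xformers.__version__)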

Pretrained Weights

Download the pretrained model from Hugging Face and place it under pretrained_models/3DEnhancer:

mkdir -p pretrained_models/3DEnhancer
wget -P pretrained_models/3DEnhancer https://huggingface.co/Luo-Yihang/3DEnhancer/resolve/main/model.safetensors
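
As an optional integrity check, you can enumerate the tensors stored in the downloaded checkpoint with the standard safetensors API (install safetensors with pip if it is not already pulled in by the requirements):

# Optional integrity check: list the tensors stored in the checkpoint
from safetensors import safe_open

path = "pretrained_models/3DEnhancer/model.safetensors"
with safe_open(path, framework="pt") as f:
    keys = list(f.keys())
print(f"{len(keys)} tensors, first few: {keys[:3]}")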

Inference

The code has been tested on NVIDIA A100 and V100 GPUs. An NVIDIA GPU with at least 18GB of memory is required.

We provide example inputs in assets/examples/mv_lq, where each subfolder contains four sequential multi-view images. Run inference on a set of multi-view images with a prompt that matches their content and a chosen noise_level. For example:

python inference.py \
--input_folder assets/examples/mv_lq/vase \
--output_folder results/vase \
--prompt "vase" \
--noise_level 0

For more options, refer to inference.py.
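
To enhance all of the provided examples in one pass, a small driver like the sketch below replays the command above for each subfolder; it assumes, as in the vase example, that the folder name doubles as a usable prompt:

# Hypothetical batch driver; flag names are taken from the command above
import subprocess
from pathlib import Path

for folder in sorted(Path("assets/examples/mv_lq").iterdir()):
    if not folder.is_dir():
        continue
    subprocess.run([
        "python", "inference.py",
        "--input_folder", str(folder),
        "--output_folder", f"results/{folder.name}",
        "--prompt", folder.name,
        "--noise_level", "0",
    ], check=True)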

Demo

The script app.py provides a simple web demo for generating and enhancing multi-view images, as well as reconstructing 3D models using LGM.

Install the modified Gaussian splatting rasterizer (with depth and alpha rendering) required by LGM:

git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization
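
If the build succeeds, the extension should import cleanly; the symbols below are the ones diff-gaussian-rasterization conventionally exports, so treat them as an assumption if this fork's layout differs:

# Optional import check for the rasterizer built above
from diff_gaussian_rasterization import GaussianRasterizationSettings, GaussianRasterizer
print("diff-gaussian-rasterization imported OK")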

Download the LGM pretrained weights from Hugging Face and place them under pretrained_models/LGM:

mkdir -p pretrained_models/LGM
wget -P pretrained_models/LGM https://huggingface.co/ashawkey/LGM/resolve/main/model_fp16_fixrot.safetensors

After installing the dependencies, start the demo with:

python app.py

The web demo is also available on Hugging Face Spaces!

TODO

  • ✅ Release paper and project page.
  • ✅ Release inference code.
  • ✅ Release Gradio demo.

License

This project is licensed under NTU S-Lab License 1.0. Redistribution and use should follow this license.

Citation

If you find our code or paper helpful, please consider citing:

@article{luo20243denhancer,
  title={3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement},
  author={Yihang Luo and Shangchen Zhou and Yushi Lan and Xingang Pan and Chen Change Loy},
  journal={arXiv preprint arXiv:2412.18565},
  year={2024},
}

Contact

If you have any questions, please feel free to reach us at luo_yihang@outlook.com.
