TeCH: Text-guided Reconstruction of Lifelike Clothed Humans

Yangyi Huang* · Hongwei Yi* · Yuliang Xiu* · Tingting Liao · Jiaxiang Tang · Deng Cai · Justus Thies
* Equal contribution

3DV 2024



TeCH considers image-based reconstruction as a conditional generation task, taking conditions from both the input image and the derived descriptions. It is capable of reconstructing "lifelike" 3D clothed humans. "Lifelike" refers to 1) a detailed full-body geometry, including facial features and clothing wrinkles, in both frontal and unseen regions, and 2) a high-quality texture with consistent color and intricate patterns.

Installation

Please follow the Installation Instruction to set up all the required packages.

Getting Started

We provide a run script at scripts/run.sh. Before getting started, set the CUDA_HOME and REPLICATE_API_TOKEN environment variables in the script (get your token here).
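As a rough illustration of the setup step above, the following sketch checks that both variables are non-empty before a run. The function name check_tech_env is hypothetical and not part of the TeCH repository; it assumes the variables are visible to the shell that calls it, for example because the function is pasted into scripts/run.sh below the variable assignments.

```shell
#!/bin/sh
# Hypothetical pre-flight helper; NOT part of the TeCH repository.
# Returns 0 only when both variables named in the README are non-empty.
check_tech_env() {
    missing=0
    for var in CUDA_HOME REPLICATE_API_TOKEN; do
        # Indirect variable lookup that works in plain POSIX sh.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "error: $var is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_tech_env right before the first step of the script would make a missing token fail early instead of partway through a run.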

After that, you can use TeCH to create a highly detailed clothed human textured mesh from a single image, for example:

sh scripts/run.sh input/examples/name.img exp/examples/name

The results will be saved in the experiment folder exp/examples/name, and the textured mesh will be saved as exp/examples/name/obj/name_texture.obj.
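The output layout described above can be sketched as a small helper that maps an experiment folder to the expected mesh path. The function name tech_mesh_path is hypothetical, and the sketch assumes the mesh is named after the last component of the experiment folder, which matches the example (exp/examples/name gives name_texture.obj) but is not confirmed for other naming schemes.

```shell
#!/bin/sh
# Hypothetical helper; NOT part of the TeCH repository.
# Assumption: the mesh file is named after the last path component of the
# experiment folder, as in the README example.
tech_mesh_path() {
    exp_dir=${1%/}                    # drop one trailing slash, if present
    subject=$(basename "$exp_dir")    # e.g. "name" for exp/examples/name
    printf '%s/obj/%s_texture.obj\n' "$exp_dir" "$subject"
}
```

For example, `[ -f "$(tech_mesh_path exp/examples/name)" ]` could be used after a run to confirm that the textured mesh was written.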

Note that in "Step 3", the current DreamBooth implementation requires 2×32 GB of GPU memory, while a single 32 GB GPU is sufficient for the other steps. The entire training process for one subject takes ~3 hours on our V100 GPUs.
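To check a machine against the memory requirement above, one option is to count the GPUs that report enough total memory. The sketch below is illustrative only: count_gpus_with_mem is a hypothetical helper that reads one memory value in MiB per line from standard input, and the 32000 MiB threshold in the usage comment is an assumption, chosen because cards sold as 32 GB typically report slightly less than 32768 MiB.

```shell
#!/bin/sh
# Hypothetical helper; NOT part of the TeCH repository.
# Reads total GPU memory values (MiB, one per line) from stdin and prints
# how many of them are at least the threshold given as the first argument.
count_gpus_with_mem() {
    min_mib=$1
    awk -v min="$min_mib" '$1 + 0 >= min { n++ } END { print n + 0 }'
}

# Usage sketch (requires an NVIDIA driver; the threshold is an assumption):
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits \
#       | count_gpus_with_mem 32000
```

A result of at least 2 would suggest the machine can run "Step 3" as currently implemented; a result of 1 would cover only the other steps.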

TODOs

  • Release the evaluation protocols and results data for comparison (on the CAPE & THuman2.0 datasets).
  • Switch to the diffusers version of DreamBooth to save training memory.
  • Further improvement of efficiency and robustness.

Citation

@inproceedings{huang2024tech,
  title={{TeCH: Text-guided Reconstruction of Lifelike Clothed Humans}},
  author={Huang, Yangyi and Yi, Hongwei and Xiu, Yuliang and Liao, Tingting and Tang, Jiaxiang and Cai, Deng and Thies, Justus},
  booktitle={International Conference on 3D Vision (3DV)},
  year={2024}
}

Contributors

Kudos to all of our amazing contributors! TeCH thrives through open-source. In that spirit, we welcome all kinds of contributions from the community.


License

This code and model are available only for non-commercial research purposes as defined in the LICENSE (i.e., MIT LICENSE). Note that to use TeCH you must also register for SMPL-X and agree to its license, which is not the MIT LICENSE; see https://github.com/vchoutas/smplx/blob/main/LICENSE.

Acknowledgment

This implementation is built mainly on Stable Dreamfusion, ECON, DreamBooth-Stable-Diffusion, and the BLIP API from Salesforce on Replicate.