
A continuous diffusion language model for sequence-to-sequence tasks that uses a pretrained BERT model for word embedding.


hungphongtrn/SDPE


Sequence-to-Sequence Continuous Diffusion Language Models for Control Style Transfering

This repository is the official implementation of the models introduced in Sequence-to-Sequence Continuous Diffusion Language Models for Control Style Transfering.

The implementation is based on the BERT replication of Diffusion-LM Improves Controllable Text Generation.

@article{Li-2022-DiffusionLM,
  title={Diffusion-LM Improves Controllable Text Generation},
  author={Xiang Lisa Li and John Thickstun and Ishaan Gulrajani and Percy Liang and Tatsunori Hashimoto},
  journal={ArXiv},
  year={2022},
  volume={abs/2205.14217}
}

Setup

All models were trained on an NVIDIA A6000 GPU with 45 GiB RAM, for 20 epochs, with 250-500 diffusion steps. Details can be found in the paper.
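To give a rough sense of what the number of diffusion steps controls, here is a minimal sketch of the "sqrt" noise schedule described in the Diffusion-LM paper, applied to a toy embedding vector. Whether this repository uses the same schedule is an assumption. The function names (`sqrt_alpha_bar`, `cumulative_alphas`, `noise_embedding`) and the constants `s=1e-4` and `max_beta=0.999` are illustrative and not taken from this codebase.

```python
import math
import random


def sqrt_alpha_bar(u, s=1e-4):
    """Sqrt schedule: fraction of signal kept at normalized time u in [0, 1]."""
    return 1.0 - math.sqrt(u + s)


def cumulative_alphas(num_steps, max_beta=0.999):
    """Discretize the schedule into per-step betas and return their cumulative
    signal-retention products (one value per diffusion step)."""
    alphas_cumprod = []
    prod = 1.0
    for i in range(num_steps):
        t1 = i / num_steps
        t2 = (i + 1) / num_steps
        # Clip beta so the final step cannot remove more than max_beta of the signal.
        beta = min(1.0 - sqrt_alpha_bar(t2) / sqrt_alpha_bar(t1), max_beta)
        prod *= 1.0 - beta
        alphas_cumprod.append(prod)
    return alphas_cumprod


def noise_embedding(x0, t, alphas_cumprod, rng):
    """Sample x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps for a toy vector."""
    a_bar = alphas_cumprod[t]
    return [
        math.sqrt(a_bar) * v + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for v in x0
    ]


# Usage (hypothetical values): noise a 3-dimensional "embedding" halfway
# through a 250-step schedule.
#   schedule = cumulative_alphas(250)
#   x_t = noise_embedding([0.2, -0.1, 0.7], 125, schedule, random.Random(0))
```

With more steps, the same total amount of noise is spread over smaller increments, so each denoising step has less to undo, at the cost of more model evaluations.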

Requirements

To install requirements:

pip install -r requirements.txt

A pre-trained model is required as the backbone model for all of the models introduced in the paper. Some suggested models:

The downloaded pre-trained model should be put inside base/.
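As one possible way to populate base/, the sketch below downloads a BERT checkpoint with the Hugging Face `transformers` library and saves it locally. Several things here are assumptions: that `transformers` is available in your environment, that the training code expects the files written by `save_pretrained` directly inside base/, and the model name `bert-base-uncased`, which is used only as an example. The helpers `save_backbone` and `has_backbone` are hypothetical and not part of this repository.

```python
from pathlib import Path


def save_backbone(model_name="bert-base-uncased", target_dir="base"):
    """Download a pre-trained model and tokenizer and store them in target_dir.

    model_name is an example only; substitute the backbone you want to use.
    """
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    out = Path(target_dir)
    out.mkdir(parents=True, exist_ok=True)
    AutoTokenizer.from_pretrained(model_name).save_pretrained(out)
    AutoModel.from_pretrained(model_name).save_pretrained(out)
    return out


def has_backbone(target_dir="base"):
    """Return True if target_dir contains a saved model config (config.json)."""
    return (Path(target_dir) / "config.json").is_file()


# Usage: download once, then verify before training.
#   if not has_backbone():
#       save_backbone()
```

Check the repository's configuration for the exact directory layout it expects before relying on this.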

Training

To train the model, run this command:

python train.py

All checkpoints can be found in domains/[DATASET NAME]/checkpoints/.
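To locate the files written during training, a small helper like the following can list everything under the checkpoint directory for a given dataset, oldest first. The function name `list_checkpoints` and the dataset name in the usage comment are hypothetical. No checkpoint file extension or naming scheme is assumed, so every regular file in the directory is returned.

```python
from pathlib import Path


def list_checkpoints(dataset_name, root="domains"):
    """Return files in <root>/<dataset_name>/checkpoints/, sorted oldest first."""
    ckpt_dir = Path(root) / dataset_name / "checkpoints"
    if not ckpt_dir.is_dir():
        # Nothing has been saved yet (or the dataset name is wrong).
        return []
    files = [p for p in ckpt_dir.iterdir() if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime)


# Usage ("my_dataset" is a placeholder for your dataset directory name):
#   for path in list_checkpoints("my_dataset"):
#       print(path)
```

The last element of the returned list is the most recently written file, which is usually the one to resume from or evaluate.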

