
Commit

Merge pull request #736 from rsepassi/push
v1.6.0
lukaszkaiser authored Apr 20, 2018
2 parents 5ac81b4 + 8cf5fa4 commit 99750c4
Showing 98 changed files with 1,620 additions and 1,306 deletions.
8 changes: 6 additions & 2 deletions .gitignore
@@ -1,7 +1,5 @@
# Compiled python modules.
*.pyc
*DS_Store


# Byte-compiled
_pycache__/
@@ -18,3 +16,9 @@ dist/
# Sublime project files
*.sublime-project
*.sublime-workspace

# Tests
.pytest_cache/

# Other
*.DS_Store
9 changes: 3 additions & 6 deletions .travis.yml
@@ -8,14 +8,11 @@ env:
- T2T_DATA_DIR=/tmp/t2t-data
- T2T_TRAIN_DIR=/tmp/t2t-train
matrix:
- TF_VERSION="1.4.*"
- TF_VERSION="1.5.*"
- TF_VERSION="1.6.*"
- TF_VERSION="1.7.*"
matrix:
exclude:
- python: "3.6"
env: TF_VERSION="1.4.*"
- python: "3.6"
env: TF_VERSION="1.5.*"
- python: "3.6"
@@ -57,13 +54,13 @@ script:

# Run data generation, training, and decoding on a dummy problem
- t2t-datagen --problem=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR
- t2t-trainer --problems=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR --model=transformer --hparams_set=transformer_tiny --train_steps=5 --eval_steps=5 --output_dir=$T2T_TRAIN_DIR
- t2t-decoder --problems=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR --model=transformer --hparams_set=transformer_tiny --output_dir=$T2T_TRAIN_DIR --decode_hparams='num_samples=10'
- t2t-trainer --problem=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR --model=transformer --hparams_set=transformer_tiny --train_steps=5 --eval_steps=5 --output_dir=$T2T_TRAIN_DIR
- t2t-decoder --problem=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR --model=transformer --hparams_set=transformer_tiny --output_dir=$T2T_TRAIN_DIR --decode_hparams='num_samples=10'

# Export and query (on Python 2 only)
# Bug: https://github.com/tensorflow/serving/issues/819
#- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]] && [[ "$TF_VERSION" == "1.6.*" ]]; then
# t2t-exporter --problems=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR --model=transformer --hparams_set=transformer_tiny --output_dir=$T2T_TRAIN_DIR;
# t2t-exporter --problem=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR --model=transformer --hparams_set=transformer_tiny --output_dir=$T2T_TRAIN_DIR;
# pip install tensorflow-serving-api;
# tensorflow_model_server --port=9000 --model_name=my_model --model_base_path=$T2T_TRAIN_DIR/export/Servo &
# sleep 10;
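For reference, a minimal sketch of the datagen/train/decode sequence the CI script exercises, using the renamed `--problem` flag. The problem name is a placeholder, since `$T2T_PROBLEM` is defined in a part of the config not shown in this diff; the directories match the env section above.

```
# Placeholder problem name; the CI config sets its own value for T2T_PROBLEM.
T2T_PROBLEM=algorithmic_reverse_binary40
T2T_DATA_DIR=/tmp/t2t-data
T2T_TRAIN_DIR=/tmp/t2t-train
mkdir -p $T2T_DATA_DIR $T2T_TRAIN_DIR

# Generate data, train for a few steps, then decode a handful of samples.
t2t-datagen --problem=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR
t2t-trainer --problem=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR \
  --model=transformer --hparams_set=transformer_tiny \
  --train_steps=5 --eval_steps=5 --output_dir=$T2T_TRAIN_DIR
t2t-decoder --problem=$T2T_PROBLEM --data_dir=$T2T_DATA_DIR \
  --model=transformer --hparams_set=transformer_tiny \
  --output_dir=$T2T_TRAIN_DIR --decode_hparams='num_samples=10'
```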
46 changes: 23 additions & 23 deletions README.md
@@ -36,7 +36,7 @@ pip install tensor2tensor && t2t-trainer \
--generate_data \
--data_dir=~/t2t_data \
--output_dir=~/t2t_train/mnist \
--problems=image_mnist \
--problem=image_mnist \
--model=shake_shake \
--hparams_set=shake_shake_quick \
--train_steps=1000 \
@@ -78,13 +78,13 @@ to modify the hyperparameters if you run on a different setup.
### Image Classification

For image classification, we have a number of standard data-sets:
* ImageNet (a large data-set): `--problems=image_imagenet`, or one
* ImageNet (a large data-set): `--problem=image_imagenet`, or one
of the re-scaled versions (`image_imagenet224`, `image_imagenet64`,
`image_imagenet32`)
* CIFAR-10: `--problems=image_cifar10` (or
`--problems=image_cifar10_plain` to turn off data augmentation)
* CIFAR-100: `--problems=image_cifar100`
* MNIST: `--problems=image_mnist`
* CIFAR-10: `--problem=image_cifar10` (or
`--problem=image_cifar10_plain` to turn off data augmentation)
* CIFAR-100: `--problem=image_cifar100`
* MNIST: `--problem=image_mnist`

For ImageNet, we suggest to use the ResNet or Xception, i.e.,
use `--model=resnet --hparams_set=resnet_50` or
@@ -99,11 +99,11 @@ close to 97% accuracy on CIFAR-10.
### Language Modeling

For language modeling, we have these data-sets in T2T:
* PTB (a small data-set): `--problems=languagemodel_ptb10k` for
word-level modeling and `--problems=languagemodel_ptb_characters`
* PTB (a small data-set): `--problem=languagemodel_ptb10k` for
word-level modeling and `--problem=languagemodel_ptb_characters`
for character-level modeling.
* LM1B (a billion-word corpus): `--problems=languagemodel_lm1b32k` for
subword-level modeling and `--problems=languagemodel_lm1b_characters`
* LM1B (a billion-word corpus): `--problem=languagemodel_lm1b32k` for
subword-level modeling and `--problem=languagemodel_lm1b_characters`
for character-level modeling.

We suggest to start with `--model=transformer` on this task and use
@@ -113,7 +113,7 @@ We suggest to start with `--model=transformer` on this task and use
### Sentiment Analysis

For the task of recognizing the sentiment of a sentence, use
* the IMDB data-set: `--problems=sentiment_imdb`
* the IMDB data-set: `--problem=sentiment_imdb`

We suggest to use `--model=transformer_encoder` here and since it is
a small data-set, try `--hparams_set=transformer_tiny` and train for
@@ -122,15 +122,15 @@ few steps (e.g., `--train_steps=2000`).
### Speech Recognition

For speech-to-text, we have these data-sets in T2T:
* Librispeech (English speech to text): `--problems=librispeech` for
the whole set and `--problems=librispeech_clean` for a smaller
* Librispeech (English speech to text): `--problem=librispeech` for
the whole set and `--problem=librispeech_clean` for a smaller
but nicely filtered part.

### Summarization

For summarizing longer text into shorter one we have these data-sets:
* CNN/DailyMail articles summarized into a few sentences:
`--problems=summarize_cnn_dailymail32k`
`--problem=summarize_cnn_dailymail32k`

We suggest to use `--model=transformer` and
`--hparams_set=transformer_prepend` for this task.
@@ -139,15 +139,15 @@ This yields good ROUGE scores.
### Translation

There are a number of translation data-sets in T2T:
* English-German: `--problems=translate_ende_wmt32k`
* English-French: `--problems=translate_enfr_wmt32k`
* English-Czech: `--problems=translate_encs_wmt32k`
* English-Chinese: `--problems=translate_enzh_wmt32k`
* English-Vietnamese: `--problems=translate_envi_iwslt32k`
* English-German: `--problem=translate_ende_wmt32k`
* English-French: `--problem=translate_enfr_wmt32k`
* English-Czech: `--problem=translate_encs_wmt32k`
* English-Chinese: `--problem=translate_enzh_wmt32k`
* English-Vietnamese: `--problem=translate_envi_iwslt32k`

You can get translations in the other direction by appending `_rev` to
the problem name, e.g., for German-English use
`--problems=translate_ende_wmt32k_rev`.
`--problem=translate_ende_wmt32k_rev`.

For all translation problems, we suggest to try the Transformer model:
`--model=transformer`. At first it is best to try the base setting,
@@ -193,7 +193,7 @@ t2t-datagen \
# * If you run out of memory, add --hparams='batch_size=1024'.
t2t-trainer \
--data_dir=$DATA_DIR \
--problems=$PROBLEM \
--problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR
Expand All @@ -210,7 +210,7 @@ ALPHA=0.6
t2t-decoder \
--data_dir=$DATA_DIR \
--problems=$PROBLEM \
--problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR \
@@ -325,7 +325,7 @@ and hyperparameter set functions can compose other hyperparameter set functions.

The **trainer** binary is the main entrypoint for training, evaluation, and
inference. Users can easily switch between problems, models, and hyperparameter
sets by using the `--model`, `--problems`, and `--hparams_set` flags. Specific
sets by using the `--model`, `--problem`, and `--hparams_set` flags. Specific
hyperparameters can be overridden with the `--hparams` flag. `--schedule` and
related flags control local and distributed training/evaluation
([distributed training documentation](https://github.com/tensorflow/tensor2tensor/tree/master/docs/distributed_training.md)).
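As a hedged illustration of the flag-based switching described above (not taken from this commit), a single invocation that picks a problem, model, and hyperparameter set, and overrides one hyperparameter with `--hparams`; `DATA_DIR` and `TRAIN_DIR` are placeholders for directories of your choosing.

```
t2t-trainer \
  --data_dir=$DATA_DIR \
  --problem=translate_ende_wmt32k \
  --model=transformer \
  --hparams_set=transformer_base \
  --hparams='batch_size=1024' \
  --output_dir=$TRAIN_DIR
```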
4 changes: 2 additions & 2 deletions docs/cloud_mlengine.md
@@ -14,7 +14,7 @@ It's the same `t2t-trainer` you know and love with the addition of the
DATA_DIR=gs://my-bucket/data
OUTPUT_DIR=gs://my-bucket/train
t2t-trainer \
--problems=translate_ende_wmt32k \
--problem=translate_ende_wmt32k \
--model=transformer \
--hparams_set=transformer_base \
--data_dir=$DATA_DIR \
@@ -57,7 +57,7 @@ with `--hparams_range` and the `--autotune_*` flags:

```
t2t-trainer \
--problems=translate_ende_wmt32k \
--problem=translate_ende_wmt32k \
--model=transformer \
--hparams_set=transformer_base \
--data_dir=$DATA_DIR \
6 changes: 2 additions & 4 deletions docs/cloud_tpu.md
@@ -39,8 +39,6 @@ work on any image classification data-set.

## Tutorial: Transformer En-De translation on TPU

**Note**: You'll need TensorFlow 1.5+.

Configure the `gcloud` CLI:
```
gcloud components update
@@ -71,7 +69,7 @@ Launch! It's as simple as adding the `--cloud_tpu` flag.
t2t-trainer \
--model=transformer \
--hparams_set=transformer_tpu \
--problems=translate_ende_wmt8k \
--problem=translate_ende_wmt8k \
--train_steps=10 \
--eval_steps=10 \
--local_eval_frequency=10 \
@@ -109,7 +107,7 @@ For example, to train a shake-shake model on CIFAR you can run this command.
t2t-trainer \
--model=shake_shake \
--hparams_set=shakeshake_tpu \
--problems=image_cifar10 \
--problem=image_cifar10 \
--train_steps=180000 \
--eval_steps=9 \
--local_eval_frequency=100 \
38 changes: 19 additions & 19 deletions docs/index.md
@@ -42,13 +42,13 @@ to modify the hyperparameters if you run on a different setup.
### Image Classification

For image classification, we have a number of standard data-sets:
* ImageNet (a large data-set): `--problems=image_imagenet`, or one
* ImageNet (a large data-set): `--problem=image_imagenet`, or one
of the re-scaled versions (`image_imagenet224`, `image_imagenet64`,
`image_imagenet32`)
* CIFAR-10: `--problems=image_cifar10` (or
`--problems=image_cifar10_plain` to turn off data augmentation)
* CIFAR-100: `--problems=image_cifar100`
* MNIST: `--problems=image_mnist`
* CIFAR-10: `--problem=image_cifar10` (or
`--problem=image_cifar10_plain` to turn off data augmentation)
* CIFAR-100: `--problem=image_cifar100`
* MNIST: `--problem=image_mnist`

For ImageNet, we suggest to use the ResNet or Xception, i.e.,
use `--model=resnet --hparams_set=resnet_50` or
@@ -63,11 +63,11 @@ close to 97% accuracy on CIFAR-10.
### Language Modeling

For language modeling, we have these data-sets in T2T:
* PTB (a small data-set): `--problems=languagemodel_ptb10k` for
word-level modeling and `--problems=languagemodel_ptb_characters`
* PTB (a small data-set): `--problem=languagemodel_ptb10k` for
word-level modeling and `--problem=languagemodel_ptb_characters`
for character-level modeling.
* LM1B (a billion-word corpus): `--problems=languagemodel_lm1b32k` for
subword-level modeling and `--problems=languagemodel_lm1b_characters`
* LM1B (a billion-word corpus): `--problem=languagemodel_lm1b32k` for
subword-level modeling and `--problem=languagemodel_lm1b_characters`
for character-level modeling.

We suggest to start with `--model=transformer` on this task and use
@@ -77,7 +77,7 @@ We suggest to start with `--model=transformer` on this task and use
### Sentiment Analysis

For the task of recognizing the sentiment of a sentence, use
* the IMDB data-set: `--problems=sentiment_imdb`
* the IMDB data-set: `--problem=sentiment_imdb`

We suggest to use `--model=transformer_encoder` here and since it is
a small data-set, try `--hparams_set=transformer_tiny` and train for
@@ -86,15 +86,15 @@ few steps (e.g., `--train_steps=2000`).
### Speech Recognition

For speech-to-text, we have these data-sets in T2T:
* Librispeech (English speech to text): `--problems=librispeech` for
the whole set and `--problems=librispeech_clean` for a smaller
* Librispeech (English speech to text): `--problem=librispeech` for
the whole set and `--problem=librispeech_clean` for a smaller
but nicely filtered part.

### Summarization

For summarizing longer text into shorter one we have these data-sets:
* CNN/DailyMail articles summarized into a few sentences:
`--problems=summarize_cnn_dailymail32k`
`--problem=summarize_cnn_dailymail32k`

We suggest to use `--model=transformer` and
`--hparams_set=transformer_prepend` for this task.
@@ -103,15 +103,15 @@ This yields good ROUGE scores.
### Translation

There are a number of translation data-sets in T2T:
* English-German: `--problems=translate_ende_wmt32k`
* English-French: `--problems=translate_enfr_wmt32k`
* English-Czech: `--problems=translate_encs_wmt32k`
* English-Chinese: `--problems=translate_enzh_wmt32k`
* English-Vietnamese: `--problems=translate_envi_iwslt32k`
* English-German: `--problem=translate_ende_wmt32k`
* English-French: `--problem=translate_enfr_wmt32k`
* English-Czech: `--problem=translate_encs_wmt32k`
* English-Chinese: `--problem=translate_enzh_wmt32k`
* English-Vietnamese: `--problem=translate_envi_iwslt32k`

You can get translations in the other direction by appending `_rev` to
the problem name, e.g., for German-English use
`--problems=translate_ende_wmt32k_rev`.
`--problem=translate_ende_wmt32k_rev`.

For all translation problems, we suggest to try the Transformer model:
`--model=transformer`. At first it is best to try the base setting,
2 changes: 1 addition & 1 deletion docs/new_problem.md
@@ -239,6 +239,6 @@ clone the repository and install it in developer mode with `pip install -e .`.
# Train!

You can train exactly as you do in the [walkthrough](walkthrough.md) with flags
`--problems=poetry_lines` and `--t2t_usr_dir=$USR_DIR`.
`--problem=poetry_lines` and `--t2t_usr_dir=$USR_DIR`.

All done. Let us know what amazing poetry your model writes!
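A hedged sketch of the resulting training command for the new problem; the model and hyperparameter set below are assumptions for illustration, and the directory variables are placeholders rather than values defined in this commit.

```
# Sketch only: trains the user-defined poetry_lines problem registered under
# --t2t_usr_dir; DATA_DIR, TRAIN_DIR, and USR_DIR are placeholders.
t2t-trainer \
  --generate_data \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --t2t_usr_dir=$USR_DIR \
  --problem=poetry_lines \
  --model=transformer \
  --hparams_set=transformer_base
```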
4 changes: 2 additions & 2 deletions docs/tutorials/asr_with_transformer.md
@@ -29,7 +29,7 @@ To train a model on GPU set up `OUT_DIR` and run the trainer:
t2t-trainer \
--model=transformer \
--hparams_set=transformer_librispeech \
--problems=librispeech \
--problem=librispeech \
--train_steps=120000 \
--eval_steps=3 \
--local_eval_frequency=100 \
@@ -48,7 +48,7 @@ To train a model on TPU set up `OUT_DIR` and run the trainer:
t2t-trainer \
--model=transformer \
--hparams_set=transformer_librispeech_tpu \
--problems=librispeech \
--problem=librispeech \
--train_steps=120000 \
--eval_steps=3 \
--local_eval_frequency=100 \