diff --git a/README.md b/README.md
index 00a0f850a..10c302308 100644
--- a/README.md
+++ b/README.md
@@ -89,9 +89,9 @@ There is also the master branch that contains all of our most recent work, but c
 ### Building fastText using make (preferred)
 
 ```
-$ wget https://github.com/facebookresearch/fastText/archive/v0.9.1.zip
-$ unzip v0.9.1.zip
-$ cd fastText-0.9.1
+$ wget https://github.com/facebookresearch/fastText/archive/v0.9.2.zip
+$ unzip v0.9.2.zip
+$ cd fastText-0.9.2
 $ make
 ```
 
diff --git a/docs/supervised-tutorial.md b/docs/supervised-tutorial.md
index b7a6eafef..1f0a1fb06 100644
--- a/docs/supervised-tutorial.md
+++ b/docs/supervised-tutorial.md
@@ -18,14 +18,14 @@ The first step of this tutorial is to install and build fastText. It only requir
 Let us start by downloading the [most recent release](https://github.com/facebookresearch/fastText/releases):
 
 ```bash
-$ wget https://github.com/facebookresearch/fastText/archive/v0.9.1.zip
-$ unzip v0.9.1.zip
+$ wget https://github.com/facebookresearch/fastText/archive/v0.9.2.zip
+$ unzip v0.9.2.zip
 ```
 
 Move to the fastText directory and build it:
 
 ```bash
-$ cd fastText-0.9.1
+$ cd fastText-0.9.2
 # for command line tool :
 $ make
 # for python bindings :
@@ -80,32 +80,32 @@ DESCRIPTION
 FUNCTIONS
     load_model(path)
         Load a model given a filepath and return a model object.
-    
+
     read_args(arg_list, arg_dict, arg_names, default_values)
-    
+
     tokenize(text)
         Given a string of text, tokenize it and return a list of tokens
-    
+
     train_supervised(*kargs, **kwargs)
         Train a supervised model and return a model object.
-    
+
         input must be a filepath. The input text does not need to be tokenized
         as per the tokenize function, but it must be preprocessed and encoded
         as UTF-8. You might want to consult standard preprocessing scripts such
         as tokenizer.perl mentioned here: http://www.statmt.org/wmt07/baseline.html
-    
+
         The input file must must contain at least one label per line. For an
         example consult the example datasets which are part of the fastText
         repository such as the dataset pulled by classification-example.sh.
-    
+
     train_unsupervised(*kargs, **kwargs)
         Train an unsupervised model and return a model object.
-    
+
         input must be a filepath. The input text does not need to be tokenized
         as per the tokenize function, but it must be preprocessed and encoded
         as UTF-8. You might want to consult standard preprocessing scripts such
         as tokenizer.perl mentioned here: http://www.statmt.org/wmt07/baseline.html
-    
+
         The input field must not contain any labels or use the specified label
         prefix unless it is ok for those words to be ignored. For an example
         consult the dataset pulled by the example script word-vector-example.sh, which is
@@ -366,7 +366,7 @@ This is much better! Another way to change the learning speed of our model is to
 
 ```bash
->> ./fasttext supervised -input cooking.train -output model_cooking -lr 1.0 
+>> ./fasttext supervised -input cooking.train -output model_cooking -lr 1.0
 Read 0M words
 Number of words: 9012
 Number of labels: 734
diff --git a/setup.py b/setup.py
index cc7203a8d..caf445f11 100644
--- a/setup.py
+++ b/setup.py
@@ -21,7 +21,7 @@
 import platform
 import io
 
-__version__ = '0.9.1'
+__version__ = '0.9.2'
 FASTTEXT_SRC = "src"
 
 # Based on https://github.com/pybind/python_example
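
The supervised-tutorial hunk above documents the Python entry points `train_supervised` and `load_model`. For readers applying this patch, a minimal sketch of that workflow looks like the following; it assumes the 0.9.2 Python bindings are installed (`make` plus the Python build from the fastText-0.9.2 checkout) and that the tutorial's `cooking.train` dataset is present — the save/predict calls are standard fastText API, but this snippet is illustrative and not part of the patch itself:

```python
# Minimal sketch (not part of the patch): train, save, reload, and query a
# supervised classifier via the Python bindings described in the hunk above.
import fasttext

# `-lr 1.0` on the command line corresponds to lr=1.0 here;
# `cooking.train` is the dataset used throughout the tutorial.
model = fasttext.train_supervised(input="cooking.train", lr=1.0)

# Persist the classifier, reload it, and predict a label for one question
# (the example sentence comes from the same tutorial).
model.save_model("model_cooking.bin")
model = fasttext.load_model("model_cooking.bin")
labels, probs = model.predict("Which baking dish is best to bake a banana bread ?")
print(labels, probs)
```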