diff --git a/docs/hub/_toctree.yml b/docs/hub/_toctree.yml
index 9fc9fe1d2..dce951909 100644
--- a/docs/hub/_toctree.yml
+++ b/docs/hub/_toctree.yml
@@ -87,6 +87,8 @@
     title: Sample Factory
   - local: sentence-transformers
     title: Sentence Transformers
+  - local: setfit
+    title: SetFit
   - local: spacy
     title: spaCy
   - local: span_marker
diff --git a/docs/hub/models-libraries.md b/docs/hub/models-libraries.md
index 1da319129..11c80d53f 100644
--- a/docs/hub/models-libraries.md
+++ b/docs/hub/models-libraries.md
@@ -29,6 +29,7 @@ The table below summarizes the supported libraries and their level of integratio
 | [RL-Baselines3-Zoo](https://github.com/DLR-RM/rl-baselines3-zoo) | Training framework for Reinforcement Learning, using [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3).| ❌ | ✅ | ✅ | ✅ |
 | [Sample Factory](https://github.com/alex-petrenko/sample-factory) | Codebase for high throughput asynchronous reinforcement learning. | ❌ | ✅ | ✅ | ✅ |
 | [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) | Compute dense vector representations for sentences, paragraphs, and images. | ✅ | ✅ | ✅ | ✅ |
+| [SetFit](https://github.com/huggingface/setfit) | Efficient few-shot text classification with Sentence Transformers. | ✅ | ✅ | ✅ | ✅ |
 | [spaCy](https://github.com/explosion/spaCy) | Advanced Natural Language Processing in Python and Cython. | ✅ | ✅ | ✅ | ✅ |
 | [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) | Familiar, simple and state-of-the-art Named Entity Recognition. | ✅ | ✅ | ✅ | ✅ |
 | [Scikit Learn (using skops)](https://skops.readthedocs.io/en/stable/) | Machine Learning in Python. | ✅ | ✅ | ✅ | ✅ |
diff --git a/docs/hub/setfit.md b/docs/hub/setfit.md
new file mode 100644
index 000000000..d0bb3a4ec
--- /dev/null
+++ b/docs/hub/setfit.md
@@ -0,0 +1,53 @@
+# Using SetFit with Hugging Face
+
+SetFit is an efficient and prompt-free framework for few-shot fine-tuning of [Sentence Transformers](https://sbert.net/). It achieves high accuracy with little labeled data: for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples 🤯!
+
+Compared to other few-shot learning methods, SetFit has several unique features:
+
+* 🗣 **No prompts or verbalizers:** Current techniques for few-shot fine-tuning require handcrafted prompts or verbalizers to convert examples into a format suitable for the underlying language model. SetFit dispenses with prompts altogether by generating rich embeddings directly from text examples.
+* 🏎 **Fast to train:** SetFit doesn't require large-scale models like [T0](https://huggingface.co/bigscience/T0) or GPT-3 to achieve high accuracy. As a result, it is typically an order of magnitude (or more) faster to train and run inference with.
+* 🌎 **Multilingual support:** SetFit can be used with any [Sentence Transformer](https://huggingface.co/models?library=sentence-transformers&sort=downloads) on the Hub, which means you can classify text in multiple languages by simply fine-tuning a multilingual checkpoint.
+
+## Exploring SetFit on the Hub
+
+You can find SetFit models by using the library filter on the left of the [models page](https://huggingface.co/models?library=setfit).
+
+All models on the Hub come with these useful features:
+1. An automatically generated model card with a brief description.
+2. An interactive widget you can use to play with the model directly in the browser.
+3. An Inference API that allows you to make inference requests.
+
+## Installation
+
+To get started, you can follow the [SetFit installation guide](https://huggingface.co/docs/setfit/installation). You can also use the following one-line install through pip:
+
+```
+pip install -U setfit
+```
+
+## Using existing models
+
+All `setfit` models can easily be loaded from the Hub.
+
+```py
+from setfit import SetFitModel
+
+model = SetFitModel.from_pretrained("tomaarsen/setfit-paraphrase-mpnet-base-v2-sst2-8-shot")
+```
+
+Once loaded, you can use [`SetFitModel.predict`](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitModel.predict) to perform inference. Passing a list of sentences returns one label per sentence.
+
+```py
+model.predict(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])
+```
+```bash
+['positive', 'negative']
+```
+
+To load a specific SetFit model, click `Use in SetFit` on its model page and you will be given a working snippet!
+
+## Additional resources
+* [All SetFit models available on the Hub](https://huggingface.co/models?library=setfit)
+* SetFit [repository](https://github.com/huggingface/setfit)
+* SetFit [docs](https://huggingface.co/docs/setfit)
+* SetFit [paper](https://arxiv.org/abs/2209.11055)