Skip to content

Commit

Permalink
Integrate SetFit library (#1150)
Browse files Browse the repository at this point in the history
* Integrate SetFit library

* Apply suggestions from code review

Co-authored-by: Merve Noyan <[email protected]>

* Add link to T0

* Apply Pedro's suggestions

Thanks @pcuenca :)

Co-authored-by: Pedro Cuenca <[email protected]>

---------

Co-authored-by: Merve Noyan <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
  • Loading branch information
3 people authored Dec 5, 2023
1 parent 5add7a8 commit c65d133
Show file tree
Hide file tree
Showing 3 changed files with 56 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docs/hub/_toctree.yml
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,8 @@
title: Sample Factory
- local: sentence-transformers
title: Sentence Transformers
- local: setfit
title: SetFit
- local: spacy
title: spaCy
- local: span_marker
Expand Down
1 change: 1 addition & 0 deletions docs/hub/models-libraries.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ The table below summarizes the supported libraries and their level of integratio
| [RL-Baselines3-Zoo](https://github.com/DLR-RM/rl-baselines3-zoo) | Training framework for Reinforcement Learning, using [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3).|||||
| [Sample Factory](https://github.com/alex-petrenko/sample-factory) | Codebase for high throughput asynchronous reinforcement learning. |||||
| [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) | Compute dense vector representations for sentences, paragraphs, and images. |||||
| [SetFit](https://github.com/huggingface/setfit) | Efficient few-shot text classification with Sentence Transformers |||||
| [spaCy](https://github.com/explosion/spaCy) | Advanced Natural Language Processing in Python and Cython. |||||
| [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) | Familiar, simple and state-of-the-art Named Entity Recognition. |||||
| [Scikit Learn (using skops)](https://skops.readthedocs.io/en/stable/) | Machine Learning in Python. |||||
Expand Down
53 changes: 53 additions & 0 deletions docs/hub/setfit.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
# Using SetFit with Hugging Face

SetFit is an efficient and prompt-free framework for few-shot fine-tuning of [Sentence Transformers](https://sbert.net/). It achieves high accuracy with little labeled data - for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples 🤯!

Compared to other few-shot learning methods, SetFit has several unique features:

* 🗣 **No prompts or verbalizers:** Current techniques for few-shot fine-tuning require handcrafted prompts or verbalizers to convert examples into a format suitable for the underlying language model. SetFit dispenses with prompts altogether by generating rich embeddings directly from text examples.
* 🏎 **Fast to train:** SetFit doesn't require large-scale models like [T0](https://huggingface.co/bigscience/T0) or GPT-3 to achieve high accuracy. As a result, it is typically an order of magnitude (or more) faster to train and run inference with.
* 🌎 **Multilingual support**: SetFit can be used with any [Sentence Transformer](https://huggingface.co/models?library=sentence-transformers&sort=downloads) on the Hub, which means you can classify text in multiple languages by simply fine-tuning a multilingual checkpoint.

## Exploring SetFit on the Hub

You can find SetFit models by filtering at the left of the [models page](https://huggingface.co/models?library=setfit).

All models on the Hub come with these useful features:
1. An automatically generated model card with a brief description.
2. An interactive widget you can use to play with the model directly in the browser.
3. An Inference API that allows you to make inference requests.

## Installation

To get started, you can follow the [SetFit installation guide](https://huggingface.co/docs/setfit/installation). You can also use the following one-line install through pip:

```
pip install -U setfit
```

## Using existing models

All `setfit` models can easily be loaded from the Hub.

```py
from setfit import SetFitModel

model = SetFitModel.from_pretrained("tomaarsen/setfit-paraphrase-mpnet-base-v2-sst2-8-shot")
```

Once loaded, you can use [`SetFitModel.predict`](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitModel.predict) to perform inference.

```py
model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
```
```bash
['positive', 'negative']
```

If you want to load a specific SetFit model, you can click `Use in SetFit` and you will be given a working snippet!

## Additional resources
* [All SetFit models available on the Hub](https://huggingface.co/models?library=setfit)
* SetFit [repository](https://github.com/huggingface/setfit)
* SetFit [docs](https://huggingface.co/docs/setfit)
* SetFit [paper](https://arxiv.org/abs/2209.11055)

0 comments on commit c65d133

Please sign in to comment.