From 6d7bdc08eea0ddf3beb125445292025cf9e1458a Mon Sep 17 00:00:00 2001
From: Omar Sanseviero
Date: Tue, 14 May 2024 14:03:27 +0200
Subject: [PATCH] Update docs referring to inference API

* Update models-inference.md

* tweak other pages that link to that page

---------

Co-authored-by: Julien Chaumond
---
 docs/hub/index.md            |  2 +-
 docs/hub/models-inference.md | 17 ++++++++---------
 docs/hub/models-the-hub.md   |  2 +-
 docs/hub/models-widgets.md   |  2 +-
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/docs/hub/index.md b/docs/hub/index.md
index d7f0a8570..7b28e1155 100644
--- a/docs/hub/index.md
+++ b/docs/hub/index.md
@@ -109,7 +109,7 @@ The Hub offers **versioning, commit history, diffs, branches, and over a dozen l

 ## Models

-You can discover and use dozens of thousands of open-source ML models shared by the community. To promote responsible model usage and development, model repos are equipped with [Model Cards](./model-cards) to inform users of each model's limitations and biases. Additional [metadata](./model-cards#model-card-metadata) about info such as their tasks, languages, and metrics can be included, with training metrics charts even added if the repository contains [TensorBoard traces](./tensorboard). It's also easy to add an [**inference widget**](./models-widgets) to your model, allowing anyone to play with the model directly in the browser! For programmatic access, an API is provided to [**instantly serve your model**](./models-inference).
+You can discover and use dozens of thousands of open-source ML models shared by the community. To promote responsible model usage and development, model repos are equipped with [Model Cards](./model-cards) to inform users of each model's limitations and biases. Additional [metadata](./model-cards#model-card-metadata) about info such as their tasks, languages, and metrics can be included, with training metrics charts even added if the repository contains [TensorBoard traces](./tensorboard). It's also easy to add an [**inference widget**](./models-widgets) to your model, allowing anyone to play with the model directly in the browser! For programmatic access, a serverless API is provided to [**instantly serve your model**](./models-inference).

 To upload models to the Hub, or download models and integrate them into your work, explore the [**Models documentation**](./models). You can also choose from [**over a dozen libraries**](./models-libraries) such as 🤗 Transformers, Asteroid, and ESPnet that support the Hub.

diff --git a/docs/hub/models-inference.md b/docs/hub/models-inference.md
index 278870f11..0301ff5ca 100644
--- a/docs/hub/models-inference.md
+++ b/docs/hub/models-inference.md
@@ -1,9 +1,9 @@
-# Inference API
+# Serverless Inference API

-Please refer to [Inference API Documentation](https://huggingface.co/docs/api-inference) for detailed information.
+Please refer to [Serverless Inference API Documentation](https://huggingface.co/docs/api-inference) for detailed information.

-## What technology do you use to power the inference API?
+## What technology do you use to power the Serverless Inference API?

 For 🤗 Transformers models, [Pipelines](https://huggingface.co/docs/transformers/main_classes/pipelines) power the API.

@@ -14,24 +14,23 @@ On top of `Pipelines` and depending on the model type, there are several product

 For models from [other libraries](./models-libraries), the API uses [Starlette](https://www.starlette.io) and runs in [Docker containers](https://github.com/huggingface/api-inference-community/tree/main/docker_images). Each library defines the implementation of [different pipelines](https://github.com/huggingface/api-inference-community/tree/main/docker_images/sentence_transformers/app/pipelines).

-## How can I turn off the inference API for my model?
+## How can I turn off the Serverless Inference API for my model?

 Specify `inference: false` in your model card's metadata.

-## Why don't I see an inference widget or why can't I use the inference API?
+## Why don't I see an inference widget, or why can't I use the API?

-For some tasks, there might not be support in the inference API, and, hence, there is no widget.
+For some tasks, there might not be support in the Serverless Inference API, and, hence, there is no widget.
 For all libraries (except 🤗 Transformers), there is a [library-to-tasks.ts file](https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/library-to-tasks.ts) of supported tasks in the API. When a model repository has a task that is not supported by the repository library, the repository has `inference: false` by default.

-
 ## Can I send large volumes of requests? Can I get accelerated APIs?

 If you are interested in accelerated inference, higher volumes of requests, or an SLA, please contact us at `api-enterprise at huggingface.co`.

 ## How can I see my usage?

-You can head to the [Inference API dashboard](https://api-inference.huggingface.co/dashboard/). Learn more about it in the [Inference API documentation](https://huggingface.co/docs/api-inference/usage).
+You can check your usage in the [Inference Dashboard](https://ui.endpoints.huggingface.co/endpoints). The dashboard shows both your serverless and dedicated endpoints usage.

-## Is there programmatic access to the Inference API?
+## Is there programmatic access to the Serverless Inference API?

 Yes, the `huggingface_hub` library has a client wrapper documented [here](https://huggingface.co/docs/huggingface_hub/how-to-inference).
diff --git a/docs/hub/models-the-hub.md b/docs/hub/models-the-hub.md
index 26b8a6e2b..82e103562 100644
--- a/docs/hub/models-the-hub.md
+++ b/docs/hub/models-the-hub.md
@@ -2,7 +2,7 @@

 ## What is the Model Hub?

-The Model Hub is where the members of the Hugging Face community can host all of their model checkpoints for simple storage, discovery, and sharing. Download pre-trained models with the [`huggingface_hub` client library](https://huggingface.co/docs/huggingface_hub/index), with 🤗 [`Transformers`](https://huggingface.co/docs/transformers/index) for fine-tuning and other usages or with any of the over [15 integrated libraries](./models-libraries). You can even leverage the [Inference API](./models-inference) to use models in production settings.
+The Model Hub is where the members of the Hugging Face community can host all of their model checkpoints for simple storage, discovery, and sharing. Download pre-trained models with the [`huggingface_hub` client library](https://huggingface.co/docs/huggingface_hub/index), with 🤗 [`Transformers`](https://huggingface.co/docs/transformers/index) for fine-tuning and other usages or with any of the over [15 integrated libraries](./models-libraries). You can even leverage the [Serverless Inference API](./models-inference) or [Inference Endpoints](https://huggingface.co/docs/inference-endpoints) to use models in production settings.

 You can refer to the following video for a guide on navigating the Model Hub:

diff --git a/docs/hub/models-widgets.md b/docs/hub/models-widgets.md
index eceb18d9f..021d6452d 100644
--- a/docs/hub/models-widgets.md
+++ b/docs/hub/models-widgets.md
@@ -188,4 +188,4 @@ inference:
   temperature: 0.7
 ```

-The Inference API allows you to send HTTP requests to models in the Hugging Face Hub, and it's 2x to 10x faster than the widgets! ⚡⚡ Learn more about it by reading the [Inference API documentation](./models-inference). Finally, you can also deploy all those models to dedicated [Inference Endpoints](https://huggingface.co/docs/inference-endpoints).
+The Serverless Inference API allows you to send HTTP requests to models in the Hugging Face Hub programmatically. ⚡⚡ Learn more about it by reading the [Inference API documentation](./models-inference). Finally, you can also deploy all those models to dedicated [Inference Endpoints](https://huggingface.co/docs/inference-endpoints).
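
A few illustrative sketches follow for the APIs these docs describe. First, the "What technology do you use to power the Serverless Inference API?" section says 🤗 Transformers models are served through Pipelines. A minimal local sketch of that abstraction, where the task and model id are illustrative choices and not necessarily what the service deploys:

```python
# Sketch of the transformers Pipeline abstraction that, per the docs above,
# powers the API for Transformers models. Task and model id are illustrative;
# any pipeline-compatible checkpoint would work the same way.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("I love the new serverless docs!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```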
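The "programmatic access" answer points to the client wrapper in `huggingface_hub`. A minimal sketch using its `InferenceClient`, assuming `huggingface_hub` is installed; the model id, prompt, and `hf_xxx` token are placeholders:

```python
# Sketch of programmatic access via the huggingface_hub client wrapper.
# The token is optional for public models but grants higher rate limits.
from huggingface_hub import InferenceClient

client = InferenceClient(model="gpt2", token="hf_xxx")  # placeholders
print(client.text_generation("The Hugging Face Hub is", max_new_tokens=20))
```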
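Finally, the models-widgets.md paragraph notes that the Serverless Inference API accepts plain HTTP requests. A sketch with `requests`, following the URL pattern from the api-inference docs; again, the model id and token are placeholders:

```python
# Sketch of a raw HTTP call to the Serverless Inference API.
import requests

API_URL = "https://api-inference.huggingface.co/models/gpt2"  # placeholder model
headers = {"Authorization": "Bearer hf_xxx"}  # your User Access Token

response = requests.post(API_URL, headers=headers, json={"inputs": "Hello, Hub!"})
response.raise_for_status()
print(response.json())
```

A dedicated Inference Endpoint typically accepts the same kind of JSON payload against its own URL, which is what makes moving from serverless to dedicated deployments straightforward.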