From 6d7bdc08eea0ddf3beb125445292025cf9e1458a Mon Sep 17 00:00:00 2001
From: Omar Sanseviero
Date: Tue, 14 May 2024 14:03:27 +0200
Subject: [PATCH] Update docs referring to inference API

* Update models-inference.md

* tweak other pages that link to that page

---------

Co-authored-by: Julien Chaumond
---
 docs/hub/index.md            |  2 +-
 docs/hub/models-inference.md | 17 ++++++++---------
 docs/hub/models-the-hub.md   |  2 +-
 docs/hub/models-widgets.md   |  2 +-
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/docs/hub/index.md b/docs/hub/index.md
index d7f0a8570..7b28e1155 100644
--- a/docs/hub/index.md
+++ b/docs/hub/index.md
@@ -109,7 +109,7 @@ The Hub offers **versioning, commit history, diffs, branches, and over a dozen l

 ## Models

-You can discover and use dozens of thousands of open-source ML models shared by the community. To promote responsible model usage and development, model repos are equipped with [Model Cards](./model-cards) to inform users of each model's limitations and biases. Additional [metadata](./model-cards#model-card-metadata) about info such as their tasks, languages, and metrics can be included, with training metrics charts even added if the repository contains [TensorBoard traces](./tensorboard). It's also easy to add an [**inference widget**](./models-widgets) to your model, allowing anyone to play with the model directly in the browser! For programmatic access, an API is provided to [**instantly serve your model**](./models-inference).
+You can discover and use dozens of thousands of open-source ML models shared by the community. To promote responsible model usage and development, model repos are equipped with [Model Cards](./model-cards) to inform users of each model's limitations and biases. Additional [metadata](./model-cards#model-card-metadata) about info such as their tasks, languages, and metrics can be included, with training metrics charts even added if the repository contains [TensorBoard traces](./tensorboard). It's also easy to add an [**inference widget**](./models-widgets) to your model, allowing anyone to play with the model directly in the browser! For programmatic access, a serverless API is provided to [**instantly serve your model**](./models-inference).

 To upload models to the Hub, or download models and integrate them into your work, explore the [**Models documentation**](./models). You can also choose from [**over a dozen libraries**](./models-libraries) such as 🤗 Transformers, Asteroid, and ESPnet that support the Hub.

diff --git a/docs/hub/models-inference.md b/docs/hub/models-inference.md
index 278870f11..0301ff5ca 100644
--- a/docs/hub/models-inference.md
+++ b/docs/hub/models-inference.md
@@ -1,9 +1,9 @@
-# Inference API
+# Serverless Inference API

-Please refer to [Inference API Documentation](https://huggingface.co/docs/api-inference) for detailed information.
+Please refer to [Serverless Inference API Documentation](https://huggingface.co/docs/api-inference) for detailed information.

-## What technology do you use to power the inference API?
+## What technology do you use to power the Serverless Inference API?

 For 🤗 Transformers models, [Pipelines](https://huggingface.co/docs/transformers/main_classes/pipelines) power the API.

@@ -14,24 +14,23 @@ On top of `Pipelines` and depending on the model type, there are several product

 For models from [other libraries](./models-libraries), the API uses [Starlette](https://www.starlette.io) and runs in [Docker containers](https://github.com/huggingface/api-inference-community/tree/main/docker_images). Each library defines the implementation of [different pipelines](https://github.com/huggingface/api-inference-community/tree/main/docker_images/sentence_transformers/app/pipelines).

-## How can I turn off the inference API for my model?
+## How can I turn off the Serverless Inference API for my model?

 Specify `inference: false` in your model card's metadata.

-## Why don't I see an inference widget or why can't I use the inference API?
+## Why don't I see an inference widget, or why can't I use the API?

-For some tasks, there might not be support in the inference API, and, hence, there is no widget.
+For some tasks, there might not be support in the Serverless Inference API, and, hence, there is no widget.
 For all libraries (except 🤗 Transformers), there is a [library-to-tasks.ts file](https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/library-to-tasks.ts) of supported tasks in the API. When a model repository has a task that is not supported by the repository library, the repository has `inference: false` by default.

-
 ## Can I send large volumes of requests? Can I get accelerated APIs?

 If you are interested in accelerated inference, higher volumes of requests, or an SLA, please contact us at `api-enterprise at huggingface.co`.

 ## How can I see my usage?

-You can head to the [Inference API dashboard](https://api-inference.huggingface.co/dashboard/). Learn more about it in the [Inference API documentation](https://huggingface.co/docs/api-inference/usage).
+You can check your usage in the [Inference Dashboard](https://ui.endpoints.huggingface.co/endpoints). The dashboard shows both your serverless and dedicated endpoints usage.

-## Is there programmatic access to the Inference API?
+## Is there programmatic access to the Serverless Inference API?

 Yes, the `huggingface_hub` library has a client wrapper documented [here](https://huggingface.co/docs/huggingface_hub/how-to-inference).
diff --git a/docs/hub/models-the-hub.md b/docs/hub/models-the-hub.md
index 26b8a6e2b..82e103562 100644
--- a/docs/hub/models-the-hub.md
+++ b/docs/hub/models-the-hub.md
@@ -2,7 +2,7 @@

 ## What is the Model Hub?

-The Model Hub is where the members of the Hugging Face community can host all of their model checkpoints for simple storage, discovery, and sharing. Download pre-trained models with the [`huggingface_hub` client library](https://huggingface.co/docs/huggingface_hub/index), with 🤗 [`Transformers`](https://huggingface.co/docs/transformers/index) for fine-tuning and other usages or with any of the over [15 integrated libraries](./models-libraries). You can even leverage the [Inference API](./models-inference) to use models in production settings.
+The Model Hub is where the members of the Hugging Face community can host all of their model checkpoints for simple storage, discovery, and sharing. Download pre-trained models with the [`huggingface_hub` client library](https://huggingface.co/docs/huggingface_hub/index), with 🤗 [`Transformers`](https://huggingface.co/docs/transformers/index) for fine-tuning and other usages or with any of the over [15 integrated libraries](./models-libraries). You can even leverage the [Serverless Inference API](./models-inference) or [Inference Endpoints](https://huggingface.co/docs/inference-endpoints) to use models in production settings.

 You can refer to the following video for a guide on navigating the Model Hub:

diff --git a/docs/hub/models-widgets.md b/docs/hub/models-widgets.md
index eceb18d9f..021d6452d 100644
--- a/docs/hub/models-widgets.md
+++ b/docs/hub/models-widgets.md
@@ -188,4 +188,4 @@ inference:
   temperature: 0.7
 ```

-The Inference API allows you to send HTTP requests to models in the Hugging Face Hub, and it's 2x to 10x faster than the widgets! ⚡⚡ Learn more about it by reading the [Inference API documentation](./models-inference). Finally, you can also deploy all those models to dedicated [Inference Endpoints](https://huggingface.co/docs/inference-endpoints).
+The Serverless Inference API allows you to send HTTP requests to models in the Hugging Face Hub programmatically. ⚡⚡ Learn more about it by reading the [Inference API documentation](./models-inference). Finally, you can also deploy all those models to dedicated [Inference Endpoints](https://huggingface.co/docs/inference-endpoints).
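
A few illustrative sketches follow for the APIs these docs describe. First, the "What technology do you use to power the Serverless Inference API?" section says 🤗 Transformers models are served through Pipelines. A minimal local sketch of that abstraction, where the task and model id are illustrative choices and not necessarily what the service deploys:

```python
# Sketch of the transformers Pipeline abstraction that, per the docs above,
# powers the API for Transformers models. Task and model id are illustrative;
# any pipeline-compatible checkpoint would work the same way.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("I love the new serverless docs!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```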
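The "programmatic access" answer points to the client wrapper in `huggingface_hub`. A minimal sketch using its `InferenceClient`, assuming `huggingface_hub` is installed; the model id, prompt, and `hf_xxx` token are placeholders:

```python
# Sketch of programmatic access via the huggingface_hub client wrapper.
# The token is optional for public models but grants higher rate limits.
from huggingface_hub import InferenceClient

client = InferenceClient(model="gpt2", token="hf_xxx")  # placeholders
print(client.text_generation("The Hugging Face Hub is", max_new_tokens=20))
```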
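Finally, the models-widgets.md paragraph notes that the Serverless Inference API accepts plain HTTP requests. A sketch with `requests`, following the URL pattern from the api-inference docs; again, the model id and token are placeholders:

```python
# Sketch of a raw HTTP call to the Serverless Inference API.
import requests

API_URL = "https://api-inference.huggingface.co/models/gpt2"  # placeholder model
headers = {"Authorization": "Bearer hf_xxx"}  # your User Access Token

response = requests.post(API_URL, headers=headers, json={"inputs": "Hello, Hub!"})
response.raise_for_status()
print(response.json())
```

A dedicated Inference Endpoint typically accepts the same kind of JSON payload against its own URL, which is what makes moving from serverless to dedicated deployments straightforward.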