First draft for text-to-image, image-to-image + generate script #1384

Wauplin · 2024-08-20T14:25:41Z

(related to #1379)

cc @osanseviero for viz'

End goal is to generate this page based on info from:

- input specs
- output specs
- the task page https://huggingface.co/api/tasks
- (maybe huggingface_hub doc example?)
- (maybe huggingface.js doc example?)

I wrote the content in this PR manually to validate the format. I have added:

- a small description from the tasks page
- a link to https://huggingface.co/tasks/text-to-image for more info
- a list of recommended models from the tasks page
- the API specification
  - inputs: payload (from specs) and headers (will always be the same)
  - output: payload (from specs). In the text-to-image example it's annoying because the output is not JSON, so it's not describable using openschema. We should find a proxy to say "it's just bytes" in the specs so that docs are generated correctly.
- examples:
  - cURL => how to generate (?)
  - Python => from huggingface_hub example (?). Add link to docs.
  - JavaScript => from huggingface.js (?). Add link to docs.
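
For illustration, here is a minimal sketch of one possible "it's just bytes" proxy. It assumes the specs could mark a binary output with `{ type: "string", format: "binary" }` (the convention OpenAPI uses for binary bodies). The `OutputSpec` type and `renderOutputSection` helper are hypothetical and are not part of huggingface.js.

```typescript
// Hypothetical, simplified output spec. The real specs are richer JSON schemas.
type OutputSpec =
  | { type: "string"; format: "binary" } // assumed marker meaning "raw bytes"
  | { type: "object"; properties: Record<string, { type: string; description: string }> };

// Render the "output" part of a task page from the spec.
function renderOutputSection(spec: OutputSpec): string {
  if (spec.type === "string") {
    // Binary output: nothing to tabulate, so describe the body in one row.
    return ["| Body | |", "| :--- | :--- |", "| **image** | The raw bytes of the generated image. |"].join("\n");
  }
  // JSON output: one table row per documented property.
  const properties = spec.properties;
  const rows = Object.keys(properties).map(
    (name) => `| **${name}** | _${properties[name].type}_ | ${properties[name].description} |`
  );
  return ["| Body | | |", "| :--- | :--- | :--- |"].concat(rows).join("\n");
}
```

With such a marker the generator would not need a special case per task: any spec flagged as binary gets the same one-row description.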

Open questions:

- We should add an example URL like https://api-inference.huggingface.co/models/black-forest-labs/FLUX.1-dev. It depends on what we do for the curl example, but we need it in any case.
- Should we add a curl example? Is it possible to generate it? Are the current examples maintained?
- Should we harmonize the Python/JavaScript snippets with the ones from https://huggingface.co/black-forest-labs/FLUX.1-dev?inference_api=true? If yes, how?

Anything else?
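
To make the "is it possible to generate it?" question concrete, here is a minimal sketch of generating a curl example from a model id. The base URL comes from the example above and the `inputs` field from the payload spec. The `buildCurlSnippet` helper, the `hf_***` token placeholder and the exact header set are assumptions for illustration, not the snippets shipped in huggingface.js.

```typescript
// Base URL taken from the example URL mentioned in this thread.
const API_BASE = "https://api-inference.huggingface.co/models";

// Hypothetical helper: build a curl command for a text-to-image request.
// Note: a prompt containing a single quote would need shell escaping.
function buildCurlSnippet(modelId: string, prompt: string): string {
  const payload = JSON.stringify({ inputs: prompt });
  return [
    `curl ${API_BASE}/${modelId} \\`,
    `  -X POST \\`,
    `  -d '${payload}' \\`,
    `  -H 'Content-Type: application/json' \\`,
    `  -H "Authorization: Bearer hf_***"`, // placeholder token, assumed header
  ].join("\n");
}
```

Calling `buildCurlSnippet("black-forest-labs/FLUX.1-dev", "Astronaut riding a horse")` yields a command that targets the example URL above.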

osanseviero · 2024-08-20T14:51:55Z

> should we add a curl example? is it possible to generate it? are the current examples maintained?

Yes, adding a curl example is important, and the current examples were recently updated 👍

Wauplin · 2024-08-20T15:39:45Z

I've updated the draft based on @osanseviero feedback:

- take snippets from https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/snippets => they will be the same as for https://huggingface.co/black-forest-labs/FLUX.1-dev?inference_api=true
- added JavaScript / cURL as well
- hard-coded a small description (not taken from an API)
- added a link to https://huggingface.co/models?inference=warm&pipeline_tag=text-to-image&sort=trending

osanseviero

Cool stuff! 🔥

osanseviero · 2024-08-20T16:22:47Z

docs/api-inference/tasks/text-to-image.md

+
+- [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev): one of the most powerful image generation models that can generate realistic outputs.
+- [latent-consistency/lcm-lora-sdxl](https://huggingface.co/latent-consistency/lcm-lora-sdxl): a powerful yet fast image generation model.
+- [Kwai-Kolors/Kolors](https://huggingface.co/Kwai-Kolors/Kolors): text-to-image model for photorealistic generation.


This model is frozen. I think it's ok for now but let's consider filtering for only warm/cold models in the future

osanseviero · 2024-08-20T16:39:55Z

docs/api-inference/tasks/text-to-image.md

+- [Kwai-Kolors/Kolors](https://huggingface.co/Kwai-Kolors/Kolors): text-to-image model for photorealistic generation.
+- [stabilityai/stable-diffusion-3-medium-diffusers](https://huggingface.co/stabilityai/stable-diffusion-3-medium-diffusers): a powerful text-to-image model.
+
+This is only a subset of the supported models. Find the model that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=text-to-image&sort=trending).


Opened an internal issue so we can do an OR of warm and cold

osanseviero · 2024-08-20T16:42:29Z

docs/api-inference/tasks/text-to-image.md

+| **inputs** | _string, required_ | The input text data (sometimes called "prompt"). |
+| **parameters.guidance_scale** | _number, optional_ | For diffusion models. A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. |
+| **parameters.negative_prompt[]** | _string, optional_ | FOne or several prompt to guide what NOT to include in image generation. |
+| **parameters.num_inference_steps** | _integer, optional_ | For diffusion models. The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. |


I'm not a fan of the repeated `parameters.` prefix. Let's think about whether we can make something that keeps a clear distinction while not being so repetitive.

for the record, I tried using nested lists

- **inputs** (_string, required_): The input text data (sometimes called "prompt").
- **parameters** (_object, optional_):
  - **guidance_scale** (_number, optional_): For diffusion models. A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality.
  - **negative_prompt[]** (_string or string[], optional_): One or several prompts to guide what NOT to include in image generation.
  - **num_inference_steps** (_integer, optional_): For diffusion models. The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
  - **target_size** (_object, optional_):
    - **width** (_integer, optional_): The size in pixels of the output image.
    - **height** (_integer, optional_): The size in pixels of the output image.
  - **scheduler** (_string, optional_): For diffusion models. Override the scheduler with a compatible one.

resulting in

I also tried a nested table like this, which I'm really not a fan of (wasted space):

| Payload | | | | |
| :--- | :--- | :--- | :--- | :--- |
| **inputs** | | | _string, required_ | The input text data (sometimes called "prompt"). |
| **parameters** | | | | |
| | **guidance_scale** | | _number, optional_ | For diffusion models. A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. |
| | **negative_prompt[]** | | _string, optional_ | One or several prompt to guide what NOT to include in image generation. |
| | **num_inference_steps** | | _integer, optional_ | For diffusion models. The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. |
| | **target_size** | | | |
| | | **width** | _integer, optional_ | The size in pixel of the output image. |
| | | **height** | _integer, optional_ | The size in pixel of the output image. |
| | **scheduler** | | _string, optional_ | For diffusion models. Override the scheduler with a compatible one. |

Let's go with the last solution. It's ugly markdown-wise but the table looks ok. Can be changed in the future.
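
As a rough sketch of how a script could emit such a table from nested specs, the helper below flattens a parameter tree into rows and indents nested names with `&nbsp;` inside the first cell, which keeps three columns instead of wasting space on empty ones. The `ParamSpec` shape and the `toRows` function are hypothetical simplifications, not the actual generate script.

```typescript
// Hypothetical, simplified spec node. The real specs are JSON schemas.
interface ParamSpec {
  type: string;
  required?: boolean;
  description?: string;
  properties?: Record<string, ParamSpec>;
}

// One indentation level inside a table cell (8 non-breaking spaces).
const INDENT = "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;";

// Depth-first walk: emit a row for the node, then rows for its children.
function toRows(name: string, spec: ParamSpec, depth: number = 0): string[] {
  let indent = "";
  for (let i = 0; i < depth; i++) indent += INDENT;
  const typing = `_${spec.type}, ${spec.required ? "required" : "optional"}_`;
  let rows = [`| **${indent}${name}** | ${typing} | ${spec.description || ""} |`];
  const children = spec.properties || {};
  for (const childName of Object.keys(children)) {
    rows = rows.concat(toRows(childName, children[childName], depth + 1));
  }
  return rows;
}
```

Each nesting level adds one `INDENT` prefix, so `target_size.width` would be indented twice under `parameters`.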

osanseviero · 2024-08-20T16:43:10Z

docs/api-inference/tasks/text-to-image.md

+| :--- | :--- | :--- |
+| **inputs** | _string, required_ | The input text data (sometimes called "prompt"). |
+| **parameters.guidance_scale** | _number, optional_ | For diffusion models. A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. |
+| **parameters.negative_prompt[]** | _string, optional_ | FOne or several prompt to guide what NOT to include in image generation. |


The typing seems a bit off to me. Is it always an array of strings (even if there is just one)?
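
If the field ends up typed as "string or string[]" (as in the nested-list attempt above), a client could normalize it before use. This is only a sketch under that assumption: `normalizeNegativePrompt` is a hypothetical helper, and whether the API itself accepts both forms is exactly the open question here.

```typescript
// Hypothetical helper: accept a single prompt or a list and always return a list.
function normalizeNegativePrompt(value?: string | string[]): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}
```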

osanseviero · 2024-08-20T16:46:09Z

docs/api-inference/tasks/text-to-image.md

+| **inputs** | _string, required_ | The input text data (sometimes called "prompt"). |
+| **parameters.guidance_scale** | _number, optional_ | For diffusion models. A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. |
+| **parameters.negative_prompt[]** | _string, optional_ | FOne or several prompt to guide what NOT to include in image generation. |
+| **parameters.num_inference_steps** | _integer, optional_ | For diffusion models. The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. |


To check: do all of these already work well out of the box?


osanseviero · 2024-08-20T16:47:30Z

docs/api-inference/tasks/text-to-image.md

+| **parameters.target_size.height** | _integer, optional_ | The size in pixel of the output image. |
+| **parameters.scheduler** | _string, optional_ | For diffusion models. Override the scheduler with a compatible one. |
+
+| Headers |   |    |


Sounds good. I would maybe also document this in docs/api-inference/task_parameters.md, as it's a general parameter for all models.


Wauplin · 2024-08-23T13:31:11Z

With #1386 being merged, we have a clean first part to merge into the new_api_docs now. Let's not forget the few TODOs there.

osanseviero

Nice 🔥

osanseviero · 2024-08-23T14:08:46Z

docs/api-inference/tasks/image_to_image.md

+
+</Tip>
+
+### Recommended models


As discussed before, we'll need to filter out models that are not warm. The one I would have here is https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0

I thought twice about it and I don't know how we can do that. The inference=warm status can change over time but since the docs are static once generated, it might not be accurate when a user visits the page.

I pushed d60825f to fetch the inference status of each model. For now I haven't changed the templates for the "Recommended models" part. For text-to-image task it would be ok but for image-to-image task there are no warm models in the suggested ones.

Maybe https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0 should be listed on https://huggingface.co/tasks/image-to-image, but that seems like a short-term solution.

https://github.com/huggingface/hub-docs/pull/1396/files
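
A minimal sketch of the filtering discussed in this thread, assuming the inference status of each recommended model has already been fetched at generation time. The `RecommendedModel` shape, the placeholder text and `renderRecommendedModels` are hypothetical. The caveat above still applies: the status can change after the static page is generated.

```typescript
// Statuses mentioned in this thread.
type InferenceStatus = "warm" | "cold" | "frozen";

// Hypothetical shape: a recommended model plus its status, fetched beforehand.
interface RecommendedModel {
  id: string;
  description: string;
  inference: InferenceStatus;
}

// Hypothetical placeholder shown when no recommended model is warm.
const PLACEHOLDER = "No recommended models are currently available for this task.";

// Keep only warm models and render them in the bullet format used by the task pages.
function renderRecommendedModels(models: RecommendedModel[]): string {
  const available = models.filter((m) => m.inference === "warm");
  if (available.length === 0) return PLACEHOLDER;
  return available
    .map((m) => `- [${m.id}](https://huggingface.co/${m.id}): ${m.description}`)
    .join("\n");
}
```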

osanseviero · 2024-08-23T14:10:40Z

docs/api-inference/tasks/text_to_image.md

+| :--- | :--- | :--- |
+| **inputs** | _string, required_ | The input text data (sometimes called "prompt" |
+| **parameters** | _object, optional_ | Additional inference parameters for Text To Image |
+| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;guidance_scale** | _number, optional_ | For diffusion models. A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. |


Is the "diffusion models" clarification important? Only diffusers supports this task

🤷‍♂️

Not really, no. I don't know where the original description comes from, but we can update the specs in huggingface.js, yes.

scripts/api-inference/templates/specs_headers.handlebars

osanseviero · 2024-08-23T14:19:57Z

scripts/api-inference/scripts/generate.ts

@@ -0,0 +1,301 @@
+import { snippets, PipelineType } from "@huggingface/tasks";


I'm skipping this file for now


Wauplin · 2024-08-27T13:35:14Z

As discussed offline, let's merge.

* First draft for text-to-image
* add correct code snippets
* Update docs/api-inference/tasks/text-to-image.md
  Co-authored-by: Omar Sanseviero
* better table?
* Generate tasks pages from script (#1386)
  * init project
  * first script to generate task pages
  * commit generated content
  * generate payload table as well
  * so undecisive
  * hey
  * better ?
  * Add image-to-image page
  * template for snippets section + few things
  * few things
* Update scripts/api-inference/templates/specs_headers.handlebars
  Co-authored-by: Omar Sanseviero
* Update scripts/api-inference/templates/specs_headers.handlebars
  Co-authored-by: Omar Sanseviero
* generate
* fetch inference status

Co-authored-by: Omar Sanseviero


First draft for text-to-image

0ef46ad

Wauplin requested a review from osanseviero August 20, 2024 14:25

add correct code snippets

949a4a7

osanseviero reviewed Aug 20, 2024


Wauplin and others added 4 commits August 21, 2024 16:12

Update docs/api-inference/tasks/text-to-image.md

2b5af74

Co-authored-by: Omar Sanseviero

better table?

1f2f4ff

Merge branch 'new_api_docs' into add-text-to-image-example

a981e9f

Generate tasks pages from script (#1386)

f6b31a8

* init project
* first script to generate task pages
* commit generated content
* generate payload table as well
* so undecisive
* hey
* better ?
* Add image-to-image page
* template for snippets section + few things
* few things

Wauplin changed the title ~~First draft for text-to-image~~ First draft for text-to-image, image-to-image + generate script Aug 23, 2024

Wauplin requested a review from osanseviero August 23, 2024 13:31

osanseviero reviewed Aug 23, 2024


Wauplin and others added 5 commits August 23, 2024 17:43

Update scripts/api-inference/templates/specs_headers.handlebars

b0565a0

Co-authored-by: Omar Sanseviero

Update scripts/api-inference/templates/specs_headers.handlebars

e656b19

Co-authored-by: Omar Sanseviero

generate

0b91a04

fetch inference status

d60825f

Merge branch 'new_api_docs' into add-text-to-image-example

dddffe3

Wauplin merged commit 12ba289 into new_api_docs Aug 27, 2024
1 check passed

Wauplin deleted the add-text-to-image-example branch August 27, 2024 13:35

This was referenced Aug 28, 2024

Draft to add text-generation parameters #1393

Merged

Filter out frozen models from API docs for tasks #1396

Merged
