Putting the provider arg more front'n'center (and other tweaks) #1114

Merged · 3 commits · Jan 17, 2025
README.md · 31 changes: 14 additions & 17 deletions
@@ -27,7 +27,7 @@ await uploadFile({
 }
 });
 
-// Use HF Inference API
+// Use HF Inference API, or external Inference Providers!
 
 await inference.chatCompletion({
   model: "meta-llama/Llama-3.1-8B-Instruct",
@@ -39,6 +39,7 @@ await inference.chatCompletion({
   ],
   max_tokens: 512,
   temperature: 0.5,
+  provider: "sambanova", // or together, fal-ai, replicate, …
 });
 
 await inference.textToImage({
@@ -146,16 +147,16 @@ for await (const chunk of inference.chatCompletionStream({
 
 /// Using a third-party provider:
 await inference.chatCompletion({
-    model: "meta-llama/Llama-3.1-8B-Instruct",
-    messages: [{ role: "user", content: "Hello, nice to meet you!" }],
-    max_tokens: 512,
-    provider: "sambanova"
+  model: "meta-llama/Llama-3.1-8B-Instruct",
+  messages: [{ role: "user", content: "Hello, nice to meet you!" }],
+  max_tokens: 512,
+  provider: "sambanova", // or together, fal-ai, replicate, …
 })
 
 await inference.textToImage({
-    model: "black-forest-labs/FLUX.1-dev",
-    inputs: "a picture of a green bird",
-    provider: "together"
+  model: "black-forest-labs/FLUX.1-dev",
+  inputs: "a picture of a green bird",
+  provider: "fal-ai",
 })
 
 
@@ -169,14 +170,10 @@ await inference.translation({
 },
 });
 
-await inference.textToImage({
-  model: 'black-forest-labs/FLUX.1-dev',
-  inputs: 'a picture of a green bird',
-})
-
 // pass multimodal files or URLs as inputs
 await inference.imageToText({
-  model: 'nlpconnect/vit-gpt2-image-captioning',
   data: await (await fetch('https://picsum.photos/300/300')).blob(),
+  model: 'nlpconnect/vit-gpt2-image-captioning',
 })
 
 // Using your own dedicated inference endpoint: https://hf.co/docs/inference-endpoints/
@@ -188,9 +185,9 @@ const llamaEndpoint = inference.endpoint(
   "https://api-inference.huggingface.co/models/meta-llama/Llama-3.1-8B-Instruct"
 );
 const out = await llamaEndpoint.chatCompletion({
-    model: "meta-llama/Llama-3.1-8B-Instruct",
-    messages: [{ role: "user", content: "Hello, nice to meet you!" }],
-    max_tokens: 512,
+  model: "meta-llama/Llama-3.1-8B-Instruct",
+  messages: [{ role: "user", content: "Hello, nice to meet you!" }],
+  max_tokens: 512,
 });
 console.log(out.choices[0].message);
 ```
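For readers skimming the diff, the headline change is the per-call `provider` argument. Here is the end state as a self-contained sketch, assuming only that `@huggingface/inference` is installed and that an HF access token is available in an `HF_TOKEN` environment variable (the env var name is this sketch's choice, not the README's):

```ts
import { HfInference } from "@huggingface/inference";

// Token from the environment; with an HF token, requests are routed through huggingface.co.
const inference = new HfInference(process.env.HF_TOKEN);

const out = await inference.chatCompletion({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [{ role: "user", content: "Hello, nice to meet you!" }],
  max_tokens: 512,
  provider: "sambanova", // or together, fal-ai, replicate, …
});
console.log(out.choices[0].message);
```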
packages/inference/README.md · 10 changes: 5 additions & 5 deletions
@@ -42,15 +42,15 @@ const hf = new HfInference('your access token')
 
 Your access token should be kept private. If you need to protect it in front-end applications, we suggest setting up a proxy server that stores the access token.
 
-### Requesting third-party inference providers
+### Third-party inference providers
 
-You can request inference from third-party providers with the inference client.
+You can send inference requests to third-party providers with the inference client.
 
 Currently, we support the following providers: [Fal.ai](https://fal.ai), [Replicate](https://replicate.com), [Together](https://together.xyz) and [Sambanova](https://sambanova.ai).
 
-To make request to a third-party provider, you have to pass the `provider` parameter to the inference function. Make sure your request is authenticated with an access token.
+To send requests to a third-party provider, you have to pass the `provider` parameter to the inference function. Make sure your request is authenticated with an access token.
 ```ts
-const accessToken = "hf_..."; // Either a HF access token, or an API key from the 3rd party provider (Replicate in this example)
+const accessToken = "hf_..."; // Either a HF access token, or an API key from the third-party provider (Replicate in this example)
 
 const client = new HfInference(accessToken);
 await client.textToImage({
@@ -63,7 +63,7 @@ await client.textToImage({
 When authenticated with a Hugging Face access token, the request is routed through https://huggingface.co.
 When authenticated with a third-party provider key, the request is made directly against that provider's inference API.
 
-Only a subset of models are supported when requesting 3rd party providers. You can check the list of supported models per pipeline tasks here:
+Only a subset of models are supported when requesting third-party providers. You can check the list of supported models per pipeline tasks here:
 - [Fal.ai supported models](./src/providers/fal-ai.ts)
 - [Replicate supported models](./src/providers/replicate.ts)
 - [Sambanova supported models](./src/providers/sambanova.ts)
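As an aside on the routing note in this diff (HF token: routed through huggingface.co; provider key: sent directly to the provider), the two modes differ only in which token the client is constructed with. A hedged sketch, not part of the PR: the token strings are placeholders, and the model/provider pairing is borrowed from the diff above.

```ts
import { HfInference } from "@huggingface/inference";

// Mode 1: a Hugging Face access token; the request is routed through huggingface.co.
const routed = new HfInference("hf_...");
await routed.textToImage({
  model: "black-forest-labs/FLUX.1-dev",
  inputs: "a picture of a green bird",
  provider: "fal-ai",
});

// Mode 2: the provider's own API key; the same call goes directly to that provider's API.
const direct = new HfInference("fal-ai-key-...");
await direct.textToImage({
  model: "black-forest-labs/FLUX.1-dev",
  inputs: "a picture of a green bird",
  provider: "fal-ai",
});
```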
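Finally, the packages/inference README above suggests a proxy server for protecting tokens in front-end apps but does not show one. Purely as an illustration (not part of this PR; the env var name, port, and upstream URL are all assumptions of this sketch), a minimal Node proxy could look like:

```ts
import { createServer } from "node:http";

// Hypothetical minimal proxy: the token lives only on the server, never in browser code.
const HF_TOKEN = process.env.HF_TOKEN; // assumption: token supplied via environment

createServer(async (req, res) => {
  // Buffer the request body coming from the browser.
  const chunks: Buffer[] = [];
  for await (const chunk of req) chunks.push(chunk as Buffer);

  // Forward to the HF Inference API, attaching the secret token server-side only.
  const upstream = await fetch(`https://api-inference.huggingface.co${req.url ?? "/"}`, {
    method: req.method,
    headers: { Authorization: `Bearer ${HF_TOKEN}`, "Content-Type": "application/json" },
    body: chunks.length > 0 ? Buffer.concat(chunks) : undefined,
  });

  // Relay status, content type, and body back to the browser (handles binary responses too).
  res.writeHead(upstream.status, {
    "Content-Type": upstream.headers.get("content-type") ?? "application/octet-stream",
  });
  res.end(Buffer.from(await upstream.arrayBuffer()));
}).listen(3000);
```

The front-end then points the client at this proxy instead of embedding the token, for example via `inference.endpoint("http://localhost:3000/models/...")` as shown in the endpoint section of the first diff.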