Compatibility with Huggingface Serverless inference #453
Comments

git-bob comment on how to make a new endpoint in this repository compatible with Huggingface's Serverless Inference. Use Llama 3.3 70B as the default model for this.
If you don't know how to do this, no problem. Just say so and I'll search for example code.
This message was generated by git-bob (version: 0.17.0, model: claude-3-5-sonnet-20241022, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

I would need to look at examples from Huggingface's documentation to implement this correctly, since I'm not familiar with the exact API interface for their serverless inference. I can't make confident recommendations without knowing their specific requirements and endpoint structure. Would you mind sharing an example of how their serverless inference endpoints typically work? Then I can help adapt the codebase to make it compatible.
You can find an example of how to use the API here: https://github.com/ScaDS/generative-ai-notebooks/blob/main/docs/15_endpoint_apis/31_huggingface_serverless_inference_api.ipynb

git-bob think about this
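For reference, calling the serverless Inference API through the huggingface_hub client typically looks roughly like the sketch below; the model ID and prompt here are illustrative assumptions, not taken from the linked notebook.

```python
import os
from huggingface_hub import InferenceClient

# Token-based authentication; assumes HF_TOKEN holds a valid Huggingface API token.
client = InferenceClient(token=os.environ["HF_TOKEN"])

# Chat-style call against a hosted model; the model ID is an illustrative assumption.
response = client.chat_completion(
    messages=[{"role": "user", "content": "Hello, who are you?"}],
    model="meta-llama/Llama-3.3-70B-Instruct",
    max_tokens=256,
)
print(response.choices[0].message.content)
```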
This message was generated by git-bob (version: 0.17.0, model: claude-3-5-sonnet-20241022, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

Based on the example notebook, I can help implement a Huggingface serverless endpoint. Here's a proposed addition to _endpoints.py:

```python
def prompt_huggingface(message: str, model="meta-llama/Llama-2-70b-chat-hf"):
    """A prompt helper function that uses Huggingface's serverless inference API"""
    from huggingface_hub import InferenceClient
    import os

    # Authenticate against the serverless Inference API with the token
    # stored in the environment.
    client = InferenceClient(token=os.environ["HF_TOKEN"])

    return client.text_generation(
        message,
        model=model,
        temperature=0.7,
        top_p=0.95,
        stop_sequences=["\n\n"]
    )
```

You'll need to set the HF_TOKEN environment variable with your Huggingface API token to use this.
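A hypothetical usage example for the proposed helper, assuming HF_TOKEN is already set in the environment:

```python
import os

# Assumes a valid Huggingface API token was exported beforehand,
# e.g. in the shell: export HF_TOKEN=hf_...
assert "HF_TOKEN" in os.environ, "Set HF_TOKEN before calling the endpoint"

answer = prompt_huggingface("Explain serverless inference in one sentence.")
print(answer)
```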
Ok, git-bob implement this, add a respective elif block to _terminal.py and add appropriate information to the README.
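For context, the requested dispatch might look roughly like the sketch below; the names llm_name and prompt_function and the surrounding structure of _terminal.py are assumptions about this repository, not its actual code.

```python
# Hypothetical excerpt from _terminal.py; names and branch structure are assumptions.
from functools import partial
from ._endpoints import prompt_huggingface

if llm_name.startswith("claude"):
    pass  # existing endpoint branches stay as they are
elif llm_name.startswith("huggingface:"):
    # Strip the prefix to obtain the model identifier, e.g.
    # "huggingface:meta-llama/Llama-3.3-70B-Instruct"
    model = llm_name[len("huggingface:"):]
    prompt_function = partial(prompt_huggingface, model=model)
```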