Compatibility with Huggingface Serverless inference #453

Open
haesleinhuepf opened this issue Jan 4, 2025 · 4 comments · May be fixed by #454
Comments

@haesleinhuepf
Owner

git-bob comment on how to make a new endpoint in this repository compatible with Huggingface's Serverless Inference. Use Llama 3.3 70B as the default model for this.

If you don't know how to do this, no problem. Just say that and I'll search for example code.

Contributor

github-actions bot commented Jan 4, 2025

This message was generated by git-bob (version: 0.17.0, model: claude-3-5-sonnet-20241022, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

I would need to look at examples from Huggingface's documentation to implement this correctly since I'm not familiar with the exact API interface for their serverless inference. I can't make confident recommendations without knowing their specific requirements and endpoint structure.

Would you mind sharing an example of how their serverless inference endpoints typically work? Then I can help adapt the codebase to make it compatible.

@haesleinhuepf
Owner Author

You can find an example of how to use the API here: https://github.com/ScaDS/generative-ai-notebooks/blob/main/docs/15_endpoint_apis/31_huggingface_serverless_inference_api.ipynb

git-bob think about this

Contributor

github-actions bot commented Jan 4, 2025

This message was generated by git-bob (version: 0.17.0, model: claude-3-5-sonnet-20241022, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

Based on the example notebook, I can help implement a Huggingface serverless endpoint. Here's a proposed addition to _endpoints.py:

def prompt_huggingface(message: str, model="meta-llama/Llama-3.3-70B-Instruct"):
    """A prompt helper function that uses Huggingface's serverless Inference API."""
    import os
    from huggingface_hub import InferenceClient

    # Authenticate using the HF_TOKEN environment variable
    client = InferenceClient(token=os.environ["HF_TOKEN"])

    # Send the prompt to the model's serverless endpoint and return the generated text
    return client.text_generation(
        message,
        model=model,
        temperature=0.7,
        top_p=0.95,
        stop_sequences=["\n\n"]
    )

You'll need to set the HF_TOKEN environment variable with your Huggingface API token to use this.
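As a minimal usage sketch (assuming the function above lives in _endpoints.py and that HF_TOKEN has already been exported in the shell; the import path depends on the package layout):

# Assumes HF_TOKEN was set beforehand, e.g. `export HF_TOKEN=hf_...`
from _endpoints import prompt_huggingface

answer = prompt_huggingface("What is the capital of France?")
print(answer)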

@haesleinhuepf
Owner Author

Ok, git-bob implement this, add a corresponding elif-block to _terminal.py, and add appropriate information to the README.
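For reference, such an elif-block might look roughly like the following. This is a sketch under assumptions: _terminal.py is not shown in this thread, so the dispatch variable `model` and the `prompt_function` convention are guesses, not the repository's actual code.

# Hypothetical addition to _terminal.py; the surrounding if/elif chain over
# the model name and the prompt_function convention are assumptions.
elif "meta-llama" in model or "huggingface" in model:
    from functools import partial
    from ._endpoints import prompt_huggingface
    # Bind the selected model so callers can invoke prompt_function(message)
    prompt_function = partial(prompt_huggingface, model=model)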

github-actions bot linked a pull request Jan 4, 2025 that will close this issue