feat: Serving Gemma 2 with multiple LoRA adapters with Text Generation Inference (TGI) on Vertex AI notebook (#1586)

# Description

This notebook showcases how to deploy Gemma 2 2B from the Hugging Face
Hub with multiple LoRA adapters fine-tuned for different purposes, such
as coding or SQL, using Hugging Face's Text Generation Inference (TGI)
Deep Learning Container (DLC) in combination with a [custom
handler](https://huggingface.co/docs/inference-endpoints/en/guides/custom_handler#create-custom-inference-handler)
on Vertex AI.
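
For orientation, here is a minimal sketch of the general flow the notebook implements: upload the TGI DLC with the base model and a comma-separated list of LoRA adapters, deploy it to a GPU-backed Vertex AI endpoint, and select an adapter per request via the `adapter_id` generation parameter. The container URI, adapter repository IDs, project, and machine settings below are illustrative placeholders, not the notebook's exact values or its custom-handler setup:

```python
import os

from google.cloud import aiplatform

aiplatform.init(project="your-gcp-project", location="us-central1")

# Upload the model pointing at the Hugging Face TGI Deep Learning Container.
# TGI loads the base model from MODEL_ID and every adapter listed
# (comma-separated Hub repos) in LORA_ADAPTERS at startup.
model = aiplatform.Model.upload(
    display_name="gemma-2-2b-multi-lora",
    # Placeholder image URI - use the TGI DLC referenced in the notebook.
    serving_container_image_uri="<TGI_DLC_IMAGE_URI>",
    serving_container_environment_variables={
        "MODEL_ID": "google/gemma-2-2b-it",
        # Placeholder adapter repos fine-tuned for coding and SQL.
        "LORA_ADAPTERS": "your-org/gemma-2-2b-coding-lora,your-org/gemma-2-2b-sql-lora",
        "HF_TOKEN": os.environ["HF_TOKEN"],  # Gemma is a gated model on the Hub
        "NUM_SHARD": "1",
    },
    serving_container_ports=[8080],
)

# Deploy onto a GPU endpoint (machine/accelerator choice is illustrative).
endpoint = model.deploy(
    machine_type="g2-standard-4",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)

# At inference time, TGI routes the request to a specific adapter via the
# `adapter_id` parameter; omit it to query the base model instead.
response = endpoint.predict(
    instances=[
        {
            "inputs": "Write a SQL query that returns the ten most recent orders.",
            "parameters": {
                "adapter_id": "your-org/gemma-2-2b-sql-lora",
                "max_new_tokens": 256,
            },
        }
    ]
)
print(response.predictions[0])
```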

---------

Co-authored-by: Holt Skinner <[email protected]>
Co-authored-by: Holt Skinner <[email protected]>
3 people authored Jan 9, 2025
1 parent bd6f555 commit 924c851
Showing 3 changed files with 1,548 additions and 2 deletions.
8 changes: 6 additions & 2 deletions .github/actions/spelling/allow.txt
@@ -779,6 +779,7 @@ getdata
getexif
getparent
gfile
gguf
gidiyor
github
gitleaks
@@ -893,6 +894,7 @@ lru
lsb
lxml
lycra
magicoder
magika
mahut
makeover
@@ -960,6 +962,7 @@ ngrams
nlp
nmade
nmilitary
nmy
noabe
nobserved
nodularis
@@ -1031,6 +1034,7 @@ projectid
proname
protobuf
pstotext
pth
pubmed
pubspec
putalpha
@@ -1208,16 +1212,16 @@ wdir
weaviate
webcam
webclient
webfonts
webpage
webpages
webfonts
webrtc
websites
weightage
welcom
werden
whatsapp
wght
whatsapp
wiffle
wikipedia
wil
5 changes: 5 additions & 0 deletions open-models/README.md
@@ -9,11 +9,16 @@ This repository contains examples for deploying and fine-tuning open source mode
- [serving/cloud_run_ollama_gemma2_rag_qa.ipynb](./serving/cloud_run_ollama_gemma2_rag_qa.ipynb) - This notebook provides steps and code to deploy an open source RAG pipeline to Cloud Run using Ollama and the Gemma 2 model.
- [serving/vertex_ai_text_generation_inference_gemma.ipynb](./serving/vertex_ai_text_generation_inference_gemma.ipynb) - This notebook provides steps and code to deploy Google Gemma with the Hugging Face DLC for Text Generation Inference (TGI) on Vertex AI.
- [serving/vertex_ai_pytorch_inference_paligemma_with_custom_handler.ipynb](./serving/vertex_ai_pytorch_inference_paligemma_with_custom_handler.ipynb) - This notebook provides steps and code to deploy Google PaliGemma with the Hugging Face Python Inference DLC using a custom handler on Vertex AI.
- [serving/vertex_ai_tgi_gemma_multi_lora_adapters_deployment.ipynb](./serving/vertex_ai_tgi_gemma_multi_lora_adapters_deployment.ipynb) - This notebook showcases how to deploy Gemma 2 from the Hugging Face Hub with multiple LoRA adapters fine-tuned for different purposes, such as coding or SQL, using Hugging Face's Text Generation Inference (TGI) Deep Learning Container (DLC) in combination with a custom handler on Vertex AI.

### Fine-tuning

- [fine-tuning/vertex_ai_trl_fine_tuning_gemma.ipynb](./fine-tuning/vertex_ai_trl_fine_tuning_gemma.ipynb) - This notebook provides steps and code to fine-tune Google Gemma with TRL via the Hugging Face PyTorch DLC for Training on Vertex AI.

### Evaluation

- [evaluation/vertex_ai_tgi_gemma_with_genai_evaluation.ipynb](./evaluation/vertex_ai_tgi_gemma_with_genai_evaluation.ipynb) - This notebook provides steps and code to use the Vertex AI Gen AI Evaluation framework to evaluate Gemma 2 on a summarization task.

### Use cases

- [use-cases/guess_app.ipynb](./use-cases/guess_app.ipynb) - This notebook shows how to build a "Guess Who or What" app using FLUX and Gemini.