An error occurred when using a custom model; please help me with this. #1190

Open
minicoco opened this issue Jan 18, 2025 · 9 comments
@minicoco

minicoco commented Jan 18, 2025

Hello, here is a description of my problem. I have tried many approaches, but it still does not work. Any ideas would be appreciated. Thank you very much.

I followed this article: https://docs.getwren.ai/oss/installation/custom_llm#running-wren-ai-with-your-custom-llm-embedder-or-document-store

I am using Ollama with llama2.

This is my ollama run command:
docker run -d --network wrenai_wren -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
I used Postman to call the API:

Image

This is my config.yaml:

#ubuntu@ubun:~/.wrenai$ cat config.yaml 
type: llm
provider: litellm_llm
timeout: 120
models:
- kwargs:
    n: 1
    temperature: 0
    response_format:
      type: json_object
  # please replace with your model name here, should be lm_studio/<MODEL_NAME>
  model: lm_studio/llama2
  api_base: http://172.20.154.233:11434/v1
  api_key_name: LLM_LM_STUDIO_API_KEY

---
type: embedder
provider: ollama_embedder
models:
  - model: nomic-embed-text
    dimension: 768
url: http://172.20.154.233:11434
timeout: 120

---
type: engine
provider: wren_ui
endpoint: http://wren-ui:3000

---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 768
timeout: 120
recreate_index: true

---
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: historical_question_indexing
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: table_description_indexing
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: db_schema_retrieval
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_correction
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: followup_sql_generation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_summary
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_answer
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_breakdown
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_expansion
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_explanation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_regeneration
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: semantics_description
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: relationship_recommendation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: question_recommendation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: intent_classification
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: data_assistance
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_pairs_preparation
    document_store: qdrant
    embedder: openai_embedder.text-embedding-3-large
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_pairs_deletion
    document_store: qdrant
    embedder: openai_embedder.text-embedding-3-large 
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: openai_embedder.text-embedding-3-large
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: preprocess_sql_data
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_executor
    engine: wren_ui
  - name: chart_generation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: chart_adjustment
    llm: litellm_llm.gpt-4o-mini-2024-07-18
---
settings:
  column_indexing_batch_size: 50
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_using_db_schemas_without_pruning: false
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: false

This is my .env file content:

ubuntu@ubun:~/.wrenai$ cat .env
COMPOSE_PROJECT_NAME=wrenai
PLATFORM=linux/amd64

PROJECT_DIR=.

# service port
WREN_ENGINE_PORT=8080
WREN_ENGINE_SQL_PORT=7432
WREN_AI_SERVICE_PORT=5555
WREN_UI_PORT=3000
IBIS_SERVER_PORT=8000
WREN_UI_ENDPOINT=http://wren-ui:${WREN_UI_PORT}

# ai service settings
QDRANT_HOST=qdrant
SHOULD_FORCE_DEPLOY=1

# vendor keys
LLM_OPENAI_API_KEY=
EMBEDDER_OPENAI_API_KEY=
LLM_AZURE_OPENAI_API_KEY=
EMBEDDER_AZURE_OPENAI_API_KEY=
QDRANT_API_KEY=

# version
# CHANGE THIS TO THE LATEST VERSION
WREN_PRODUCT_VERSION=0.14.0
WREN_ENGINE_VERSION=0.13.1
WREN_AI_SERVICE_VERSION=0.14.0
IBIS_SERVER_VERSION=0.13.1
WREN_UI_VERSION=0.19.1
WREN_BOOTSTRAP_VERSION=0.1.5

# user id (uuid v4)
USER_UUID=

# for other services
POSTHOG_API_KEY=phc_nhF32aj4xHXOZb0oqr2cn4Oy9uiWzz6CCP4KZmRq9aE
POSTHOG_HOST=https://app.posthog.com
TELEMETRY_ENABLED=true
# this is for telemetry to know the model, i think ai-service might be able to provide a endpoint to get the information
GENERATION_MODEL=gpt-4o-mini
LANGFUSE_SECRET_KEY=
LANGFUSE_PUBLIC_KEY=

# the port exposes to the host
# OPTIONAL: change the port if you have a conflict
HOST_PORT=32333
AI_SERVICE_FORWARD_PORT=5555

# Wren UI
EXPERIMENTAL_ENGINE_RUST_VERSION=false

LLM_LM_STUDIO_API_KEY=abxdxxddxedex
EMBEDDER_OLLAMA_URL=http://172.20.154.233:11434
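
The api_key_name value in config.yaml appears to name an environment variable defined in .env (here LLM_LM_STUDIO_API_KEY). A minimal sketch of a check that such a variable is present and non-empty. The helper names parse_env and missing_keys are hypothetical, the parser handles only simple KEY=VALUE lines, and the link between api_key_name and .env is an assumption drawn from the two files above:

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required variable names that are absent or empty."""
    return [name for name in required if not env.get(name)]


# A few lines copied from the .env above.
sample = """
# vendor keys
LLM_OPENAI_API_KEY=
LLM_LM_STUDIO_API_KEY=abxdxxddxedex
EMBEDDER_OLLAMA_URL=http://172.20.154.233:11434
"""
env = parse_env(sample)
print(missing_keys(env, ["LLM_LM_STUDIO_API_KEY", "LLM_OPENAI_API_KEY"]))
# prints ['LLM_OPENAI_API_KEY']
```

With the .env shown here the variable named by api_key_name is set, so the key itself is probably not what makes startup fail.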

This is the error log from my wrenai-wren-ai-service-1 container:

Waiting for qdrant to start...
qdrant has started.
Waiting for wren-ai-service to start...
INFO:     Started server process [8]
INFO:     Waiting for application startup.
I0118 03:52:26.352 8 wren-ai-service:42] Imported Provider: src.providers.document_store
I0118 03:52:26.507 8 wren-ai-service:66] Registering provider: openai_embedder
I0118 03:52:26.507 8 wren-ai-service:66] Registering provider: qdrant
I0118 03:52:26.507 8 wren-ai-service:42] Imported Provider: src.providers.document_store.qdrant
I0118 03:52:26.507 8 wren-ai-service:42] Imported Provider: src.providers.embedder
I0118 03:52:26.507 8 wren-ai-service:66] Registering provider: azure_openai_embedder
I0118 03:52:26.507 8 wren-ai-service:42] Imported Provider: src.providers.embedder.azure_openai
I0118 03:52:26.508 8 wren-ai-service:66] Registering provider: ollama_embedder
I0118 03:52:26.508 8 wren-ai-service:42] Imported Provider: src.providers.embedder.ollama
I0118 03:52:26.508 8 wren-ai-service:42] Imported Provider: src.providers.embedder.openai
I0118 03:52:26.508 8 wren-ai-service:42] Imported Provider: src.providers.engine
I0118 03:52:26.508 8 wren-ai-service:66] Registering provider: wren_ui
I0118 03:52:26.508 8 wren-ai-service:66] Registering provider: wren_ibis
I0118 03:52:26.508 8 wren-ai-service:66] Registering provider: wren_engine
I0118 03:52:26.508 8 wren-ai-service:42] Imported Provider: src.providers.engine.wren
I0118 03:52:26.508 8 wren-ai-service:42] Imported Provider: src.providers.llm
I0118 03:52:26.511 8 wren-ai-service:66] Registering provider: azure_openai_llm
I0118 03:52:26.511 8 wren-ai-service:42] Imported Provider: src.providers.llm.azure_openai
/app/.venv/lib/python3.12/site-packages/pydantic/_internal/_config.py:345: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
  warnings.warn(message, UserWarning)
I0118 03:52:27.207 8 wren-ai-service:66] Registering provider: litellm_llm
I0118 03:52:27.207 8 wren-ai-service:42] Imported Provider: src.providers.llm.litellm
I0118 03:52:27.208 8 wren-ai-service:66] Registering provider: ollama_llm
I0118 03:52:27.208 8 wren-ai-service:42] Imported Provider: src.providers.llm.ollama
I0118 03:52:27.228 8 wren-ai-service:66] Registering provider: openai_llm
I0118 03:52:27.228 8 wren-ai-service:42] Imported Provider: src.providers.llm.openai
I0118 03:52:27.228 8 wren-ai-service:42] Imported Provider: src.providers.loader
I0118 03:52:27.228 8 wren-ai-service:18] initializing provider: ollama_embedder
I0118 03:52:27.228 8 wren-ai-service:93] Getting provider: ollama_embedder from {'openai_embedder': <class 'src.providers.embedder.openai.OpenAIEmbedderProvider'>, 'qdrant': <class 'src.providers.document_store.qdrant.QdrantProvider'>, 'azure_openai_embedder': <class 'src.providers.embedder.azure_openai.AzureOpenAIEmbedderProvider'>, 'ollama_embedder': <class 'src.providers.embedder.ollama.OllamaEmbedderProvider'>, 'wren_ui': <class 'src.providers.engine.wren.WrenUI'>, 'wren_ibis': <class 'src.providers.engine.wren.WrenIbis'>, 'wren_engine': <class 'src.providers.engine.wren.WrenEngine'>, 'azure_openai_llm': <class 'src.providers.llm.azure_openai.AzureOpenAILLMProvider'>, 'litellm_llm': <class 'src.providers.llm.litellm.LitellmLLMProvider'>, 'ollama_llm': <class 'src.providers.llm.ollama.OllamaLLMProvider'>, 'openai_llm': <class 'src.providers.llm.openai.OpenAILLMProvider'>}
I0118 03:52:27.242 8 wren-ai-service:109] Pulling Ollama model nomic-embed-text
I0118 03:52:28.163 8 wren-ai-service:116] Pulling Ollama model nomic-embed-text: 100%
I0118 03:52:28.164 8 wren-ai-service:180] Using Ollama Embedding Model: nomic-embed-text
I0118 03:52:28.164 8 wren-ai-service:181] Using Ollama URL: http://172.20.154.233:11434
I0118 03:52:28.164 8 wren-ai-service:18] initializing provider: litellm_llm
I0118 03:52:28.164 8 wren-ai-service:93] Getting provider: litellm_llm from {'openai_embedder': <class 'src.providers.embedder.openai.OpenAIEmbedderProvider'>, 'qdrant': <class 'src.providers.document_store.qdrant.QdrantProvider'>, 'azure_openai_embedder': <class 'src.providers.embedder.azure_openai.AzureOpenAIEmbedderProvider'>, 'ollama_embedder': <class 'src.providers.embedder.ollama.OllamaEmbedderProvider'>, 'wren_ui': <class 'src.providers.engine.wren.WrenUI'>, 'wren_ibis': <class 'src.providers.engine.wren.WrenIbis'>, 'wren_engine': <class 'src.providers.engine.wren.WrenEngine'>, 'azure_openai_llm': <class 'src.providers.llm.azure_openai.AzureOpenAILLMProvider'>, 'litellm_llm': <class 'src.providers.llm.litellm.LitellmLLMProvider'>, 'ollama_llm': <class 'src.providers.llm.ollama.OllamaLLMProvider'>, 'openai_llm': <class 'src.providers.llm.openai.OpenAILLMProvider'>}
I0118 03:52:28.164 8 wren-ai-service:18] initializing provider: qdrant
I0118 03:52:28.164 8 wren-ai-service:93] Getting provider: qdrant from {'openai_embedder': <class 'src.providers.embedder.openai.OpenAIEmbedderProvider'>, 'qdrant': <class 'src.providers.document_store.qdrant.QdrantProvider'>, 'azure_openai_embedder': <class 'src.providers.embedder.azure_openai.AzureOpenAIEmbedderProvider'>, 'ollama_embedder': <class 'src.providers.embedder.ollama.OllamaEmbedderProvider'>, 'wren_ui': <class 'src.providers.engine.wren.WrenUI'>, 'wren_ibis': <class 'src.providers.engine.wren.WrenIbis'>, 'wren_engine': <class 'src.providers.engine.wren.WrenEngine'>, 'azure_openai_llm': <class 'src.providers.llm.azure_openai.AzureOpenAILLMProvider'>, 'litellm_llm': <class 'src.providers.llm.litellm.LitellmLLMProvider'>, 'ollama_llm': <class 'src.providers.llm.ollama.OllamaLLMProvider'>, 'openai_llm': <class 'src.providers.llm.openai.OpenAILLMProvider'>}
I0118 03:52:28.164 8 wren-ai-service:370] Using Qdrant Document Store with Embedding Model Dimension: 768
I0118 03:52:28.864 8 wren-ai-service:370] Using Qdrant Document Store with Embedding Model Dimension: 768
I0118 03:52:29.555 8 wren-ai-service:370] Using Qdrant Document Store with Embedding Model Dimension: 768
I0118 03:52:30.235 8 wren-ai-service:18] initializing provider: wren_ui
I0118 03:52:30.235 8 wren-ai-service:93] Getting provider: wren_ui from {'openai_embedder': <class 'src.providers.embedder.openai.OpenAIEmbedderProvider'>, 'qdrant': <class 'src.providers.document_store.qdrant.QdrantProvider'>, 'azure_openai_embedder': <class 'src.providers.embedder.azure_openai.AzureOpenAIEmbedderProvider'>, 'ollama_embedder': <class 'src.providers.embedder.ollama.OllamaEmbedderProvider'>, 'wren_ui': <class 'src.providers.engine.wren.WrenUI'>, 'wren_ibis': <class 'src.providers.engine.wren.WrenIbis'>, 'wren_engine': <class 'src.providers.engine.wren.WrenEngine'>, 'azure_openai_llm': <class 'src.providers.llm.azure_openai.AzureOpenAILLMProvider'>, 'litellm_llm': <class 'src.providers.llm.litellm.LitellmLLMProvider'>, 'ollama_llm': <class 'src.providers.llm.ollama.OllamaLLMProvider'>, 'openai_llm': <class 'src.providers.llm.openai.OpenAILLMProvider'>}
I0118 03:52:30.235 8 wren-ai-service:24] Using Engine: wren_ui
ERROR:    Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/src/__main__.py", line 30, in lifespan
    app.state.service_container = create_service_container(pipe_components, settings)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/globals.py", line 61, in create_service_container
    "semantics_description": generation.SemanticsDescription(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/pipelines/generation/semantics_description.py", line 197, in __init__
    "generator": llm_provider.get_generator(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get_generator'

ERROR:    Application startup failed. Exiting.
@cyyeh
Member

cyyeh commented Jan 18, 2025

@minicoco You should replace the values of llm and embedder in the pipes section of config.yaml so that they match the models you declared.

The format is <provider_name>.<model_name>. For example, litellm_llm.lm_studio/llama2
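
The rule above can be sketched as a small consistency check: every llm or embedder value in the pipes section should equal <provider_name>.<model_name> for some model declared in an llm or embedder document. This is only an illustration, assuming the service resolves these names by exact string match (the 'NoneType' error in the log is consistent with a failed lookup, but the source does not confirm the mechanism). The helper names declared_models and unresolved_pipes are hypothetical, and the components list is a hand-written stand-in for the parsed YAML documents:

```python
def declared_models(components: list[dict]) -> set[str]:
    """Collect '<provider>.<model>' names from llm and embedder documents."""
    names = set()
    for doc in components:
        if doc.get("type") in ("llm", "embedder"):
            for entry in doc.get("models") or []:
                names.add(f"{doc['provider']}.{entry['model']}")
    return names


def unresolved_pipes(components: list[dict]) -> list[tuple[str, str, str]]:
    """Return (pipe name, key, value) for llm/embedder references with no match."""
    known = declared_models(components)
    problems = []
    for doc in components:
        if doc.get("type") != "pipeline":
            continue
        for pipe in doc.get("pipes") or []:
            for key in ("llm", "embedder"):
                value = pipe.get(key)
                if value is not None and value not in known:
                    problems.append((pipe["name"], key, value))
    return problems


# Hand-written stand-in for a few documents of the config.yaml in this issue.
components = [
    {"type": "llm", "provider": "litellm_llm",
     "models": [{"model": "lm_studio/llama2"}]},
    {"type": "embedder", "provider": "ollama_embedder",
     "models": [{"model": "nomic-embed-text"}]},
    {"type": "pipeline", "pipes": [
        {"name": "db_schema_indexing",
         "embedder": "openai_embedder.text-embedding-3-large"},
        {"name": "semantics_description",
         "llm": "litellm_llm.gpt-4o-mini-2024-07-18"},
        {"name": "sql_generation", "llm": "litellm_llm.lm_studio/llama2"},
    ]},
]

for name, key, value in unresolved_pipes(components):
    print(f"{name}: {key} '{value}' is not declared")
```

Run against the original config, a check like this would flag every pipe that still points at openai_embedder.text-embedding-3-large or litellm_llm.gpt-4o-mini-2024-07-18, including semantics_description, which is where the traceback fails.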

@minicoco
Author

@cyyeh Sorry, I tried the following combinations, but I still have the same issue:
1. litellm_llm.lm_studio/llama2
2. ollama/llama2
3. llama2
4. ollama.llama2

I pulled the llama2 model with the following command:

ollama pull llama2

My model list now looks like this:

Image

Also, you mentioned that the embedder needs to use the <provider_name>.<model_name> format as well? I'm currently using model: nomic-embed-text, and it seems to work fine at startup. My current configuration is as follows:

type: llm
provider: litellm_llm
timeout: 120
models:
- kwargs:
    n: 1
    temperature: 0
    response_format:
      type: json_object
  # please replace with your model name here, should be lm_studio/<MODEL_NAME>
  model: ollama/llama2
  api_base: http://172.20.154.233:11434/v1
  api_key_name: LLM_LM_STUDIO_API_KEY

---
type: embedder
provider: ollama_embedder
models:
  - model: nomic-embed-text
    dimension: 768
url: http://172.20.154.233:11434
timeout: 120

The error is the same:

I0120 01:55:38.642 8 wren-ai-service:93] Getting provider: wren_ui from {'openai_embedder': <class 'src.providers.embedder.openai.OpenAIEmbedderProvider'>, 'qdrant': <class 'src.providers.document_store.qdrant.QdrantProvider'>, 'azure_openai_embedder': <class 'src.providers.embedder.azure_openai.AzureOpenAIEmbedderProvider'>, 'ollama_embedder': <class 'src.providers.embedder.ollama.OllamaEmbedderProvider'>, 'wren_ui': <class 'src.providers.engine.wren.WrenUI'>, 'wren_ibis': <class 'src.providers.engine.wren.WrenIbis'>, 'wren_engine': <class 'src.providers.engine.wren.WrenEngine'>, 'azure_openai_llm': <class 'src.providers.llm.azure_openai.AzureOpenAILLMProvider'>, 'litellm_llm': <class 'src.providers.llm.litellm.LitellmLLMProvider'>, 'ollama_llm': <class 'src.providers.llm.ollama.OllamaLLMProvider'>, 'openai_llm': <class 'src.providers.llm.openai.OpenAILLMProvider'>}
I0120 01:55:38.642 8 wren-ai-service:24] Using Engine: wren_ui
ERROR:    Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/src/__main__.py", line 30, in lifespan
    app.state.service_container = create_service_container(pipe_components, settings)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/globals.py", line 61, in create_service_container
    "semantics_description": generation.SemanticsDescription(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/pipelines/generation/semantics_description.py", line 197, in __init__
    "generator": llm_provider.get_generator(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get_generator'

ERROR:    Application startup failed. Exiting.
Timeout: wren-ai-service did not start within 60 seconds

@kyyz147

kyyz147 commented Jan 20, 2025

Me too. How can this be solved?
type: llm
provider: ollama_llm
timeout: 1200
models:


type: engine
provider: wren_ui
endpoint: http://wren-ui:3000

type: embedder
provider: ollama_embedder
models:


type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 768
timeout: 1200
recreate_index: true


type: pipeline
pipes:

  - name: db_schema_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: historical_question_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: table_description_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: db_schema_retrieval
    llm: ollama_llm.qwen2
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: sql_generation
    llm: ollama_llm.qwen2
    engine: wren_ui
  - name: sql_correction
    llm: ollama_llm.qwen2
    engine: wren_ui
  - name: followup_sql_generation
    llm: ollama_llm.qwen2
    engine: wren_ui
  - name: sql_summary
    llm: ollama_llm.qwen2
  - name: sql_answer
    llm: ollama_llm.qwen2
    engine: wren_ui
  - name: sql_breakdown
    llm: ollama_llm.qwen2
    engine: wren_ui
  - name: sql_expansion
    llm: ollama_llm.qwen2
    engine: wren_ui
  - name: sql_explanation
    llm: ollama_llm.qwen2
  - name: sql_regeneration
    llm: ollama_llm.qwen2
    engine: wren_ui
  - name: semantics_description
    llm: ollama_llm.qwen2
  - name: relationship_recommendation
    llm: ollama_llm.qwen2
    engine: wren_ui
  - name: question_recommendation
    llm: ollama_llm.qwen2
  - name: intent_classification
    llm: ollama_llm.qwen2
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: data_assistance
    llm: litellm_llm.llama-3.2-1b-instruct
  - name: sql_pairs_preparation
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
    llm: ollama_llm.qwen2
  - name: sql_pairs_deletion
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
    llm: ollama_llm.qwen2
  - name: preprocess_sql_data
    llm: ollama_llm.qwen2
  - name: sql_executor
    engine: wren_ui
  - name: chart_generation
    llm: ollama_llm.qwen2
  - name: chart_adjustment
    llm: ollama_llm.qwen2

settings:
  column_indexing_batch_size: 50
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_using_db_schemas_without_pruning: false
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: false

@cyyeh
Member

cyyeh commented Jan 20, 2025

@minicoco @kyyz147

Could you first try following the config.yaml example here and adapting it to your use cases? Feel free to reach out to me if there are further issues. Thank you!

https://github.com/Canner/WrenAI/blob/chore/ai-service/add-llm-configs/wren-ai-service/docs/config_examples/config.ollama.yaml

@Liudon

Liudon commented Jan 21, 2025

.env

COMPOSE_PROJECT_NAME=wrenai
PLATFORM=linux/amd64

PROJECT_DIR=/root/.wrenai

# service port
WREN_ENGINE_PORT=8080
WREN_ENGINE_SQL_PORT=7432
WREN_AI_SERVICE_PORT=5555
WREN_UI_PORT=3000
IBIS_SERVER_PORT=8000
WREN_UI_ENDPOINT=http://wren-ui:${WREN_UI_PORT}

LLM_PROVIDER=ollama_llm
GENERATION_MODEL=llama3:8b
LLM_OLLAMA_URL=http://127.0.0.1:11434
EMBEDDER_OLLAMA_URL=http://127.0.0.1:11434
LLM_LM_STUDIO_API_KEY="12345"

# ai service settings
QDRANT_HOST=qdrant
SHOULD_FORCE_DEPLOY=1

# vendor keys
LLM_OPENAI_API_KEY=
EMBEDDER_OPENAI_API_KEY=
LLM_AZURE_OPENAI_API_KEY=
EMBEDDER_AZURE_OPENAI_API_KEY=
QDRANT_API_KEY=

# version
# CHANGE THIS TO THE LATEST VERSION
WREN_PRODUCT_VERSION=0.14.0
WREN_ENGINE_VERSION=0.13.1
WREN_AI_SERVICE_VERSION=0.14.0
IBIS_SERVER_VERSION=0.13.1
WREN_UI_VERSION=0.19.1
WREN_BOOTSTRAP_VERSION=0.1.5

# user id (uuid v4)
USER_UUID=

# for other services
POSTHOG_API_KEY=phc_nhF32aj4xHXOZb0oqr2cn4Oy9uiWzz6CCP4KZmRq9aE
POSTHOG_HOST=https://app.posthog.com
TELEMETRY_ENABLED=true
# this is for telemetry to know the model, i think ai-service might be able to provide a endpoint to get the information
GENERATION_MODEL=gpt-4o-mini
LANGFUSE_SECRET_KEY=
LANGFUSE_PUBLIC_KEY=

# the port exposes to the host
# OPTIONAL: change the port if you have a conflict
HOST_PORT=3000
AI_SERVICE_FORWARD_PORT=5555

# Wren UI
EXPERIMENTAL_ENGINE_RUST_VERSION=false

config.yaml

# you should rename this file to config.yaml and put it in ~/.wrenai
# please pay attention to the comments starting with # and adjust the config accordingly

type: llm
provider: litellm_llm
timeout: 120
models:
- api_base: http://127.0.0.1:11434/v1  # change this to your ollama host, api_base should be <ollama_url>/v1
  api_key_name: LLM_LM_STUDIO_API_KEY
  model: ollama/llama3:8b  # openai/<ollama_model_name>
  kwargs:
    n: 1
    temperature: 0

---
type: embedder
provider: ollama_embedder
models:
  - model: nomic-embed-text  # put your ollama embedder model name here
    dimension: 768
url: http://127.0.0.1:11434  # change this to your ollama host, url should be <ollama_url>
timeout: 120

---
type: engine
provider: wren_ui
endpoint: http://wren-ui:3000

---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 768  # put your embedding model dimension here
timeout: 120
recreate_index: false

---
# the format of llm and embedder should be <provider>.<model_name> such as litellm_llm.gpt-4o-2024-08-06
# the pipes may be not the latest version, please refer to the latest version: https://raw.githubusercontent.com/canner/WrenAI/<WRENAI_VERSION_NUMBER>/docker/config.example.yaml
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: historical_question_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: table_description_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: db_schema_retrieval
    llm: litellm_llm.ollama/llama3:8b
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: sql_correction
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: followup_sql_generation
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: sql_summary
    llm: litellm_llm.ollama/llama3:8b
  - name: sql_answer
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: sql_breakdown
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: sql_expansion
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: sql_explanation
    llm: litellm_llm.ollama/llama3:8b
  - name: sql_regeneration
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: semantics_description
    llm: litellm_llm.ollama/llama3:8b
  - name: relationship_recommendation
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: question_recommendation
    llm: litellm_llm.ollama/llama3:8b
  - name: question_recommendation_db_schema_retrieval
    llm: litellm_llm.ollama/llama3:8b
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: question_recommendation_sql_generation
    llm: litellm_llm.ollama/llama3:8b
    engine: wren_ui
  - name: chart_generation
    llm: litellm_llm.ollama/llama3:8b
  - name: chart_adjustment
    llm: litellm_llm.ollama/llama3:8b
  - name: intent_classification
    llm: litellm_llm.ollama/llama3:8b
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: data_assistance
    llm: litellm_llm.ollama/llama3:8b
  - name: sql_pairs_indexing
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
  - name: sql_pairs_deletion
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text 
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
    llm: litellm_llm.ollama/llama3:8b
  - name: preprocess_sql_data
    llm: litellm_llm.ollama/llama3:8b
  - name: sql_executor
    engine: wren_ui
  - name: sql_question_generation
    llm: litellm_llm.ollama/llama3:8b
---
settings:
  column_indexing_batch_size: 50
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_using_db_schemas_without_pruning: false
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: true

The wren-ai-service-1 container shows this error log:

I0121 08:51:55.929 8 wren-ai-service:66] Registering provider: litellm_llm
I0121 08:51:55.929 8 wren-ai-service:42] Imported Provider: src.providers.llm.litellm
I0121 08:51:55.931 8 wren-ai-service:66] Registering provider: ollama_llm
I0121 08:51:55.931 8 wren-ai-service:42] Imported Provider: src.providers.llm.ollama
I0121 08:51:55.992 8 wren-ai-service:66] Registering provider: openai_llm
I0121 08:51:55.992 8 wren-ai-service:42] Imported Provider: src.providers.llm.openai
I0121 08:51:55.992 8 wren-ai-service:42] Imported Provider: src.providers.loader
I0121 08:51:55.992 8 wren-ai-service:18] initializing provider: ollama_embedder
I0121 08:51:55.993 8 wren-ai-service:93] Getting provider: ollama_embedder from {'openai_embedder': <class 'src.providers.embedder.openai.OpenAIEmbedderProvider'>, 'qdrant': <class 'src.providers.document_store.qdrant.QdrantProvider'>, 'azure_openai_embedder': <class 'src.providers.embedder.azure_openai.AzureOpenAIEmbedderProvider'>, 'ollama_embedder': <class 'src.providers.embedder.ollama.OllamaEmbedderProvider'>, 'wren_ui': <class 'src.providers.engine.wren.WrenUI'>, 'wren_ibis': <class 'src.providers.engine.wren.WrenIbis'>, 'wren_engine': <class 'src.providers.engine.wren.WrenEngine'>, 'azure_openai_llm': <class 'src.providers.llm.azure_openai.AzureOpenAILLMProvider'>, 'litellm_llm': <class 'src.providers.llm.litellm.LitellmLLMProvider'>, 'ollama_llm': <class 'src.providers.llm.ollama.OllamaLLMProvider'>, 'openai_llm': <class 'src.providers.llm.openai.OpenAILLMProvider'>}
ERROR:    Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
    yield
  File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 236, in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
    raise exc from None
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
    response = connection.handle_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
    raise exc
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 78, in handle_request
    stream = self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 124, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_backends/sync.py", line 207, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/local/lib/python3.12/contextlib.py", line 155, in __exit__
    self.gen.throw(value)
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/src/__main__.py", line 29, in lifespan
    pipe_components = generate_components(settings.components)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/providers/__init__.py", line 395, in generate_components
    identifier: provider_factory(config)
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/providers/__init__.py", line 19, in provider_factory
    return loader.get_provider(config.get("provider"))(**config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/providers/embedder/ollama.py", line 178, in __init__
    pull_ollama_model(self._url, self._embedding_model)
  File "/src/providers/loader.py", line 107, in pull_ollama_model
    models = [model["name"] for model in client.list()["models"]]
                                         ^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/ollama/_client.py", line 333, in list
    return self._request('GET', '/api/tags').json()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/ollama/_client.py", line 69, in _request
    response = self._client.request(method, url, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 837, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 926, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 954, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 991, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 1027, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 235, in handle_request
    with map_httpcore_exceptions():
  File "/usr/local/lib/python3.12/contextlib.py", line 155, in __exit__
    self.gen.throw(value)
  File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 111] Connection refused

ERROR:    Application startup failed. Exiting.

Ollama is running:

curl 'http://127.0.0.1:11434'
Ollama is running

curl 'http://127.0.0.1:11434/v1/models'
{"object":"list","data":[{"id":"nomic-embed-text:latest","object":"model","created":1737443888,"owned_by":"library"},{"id":"llama3:8b","object":"model","created":1737443824,"owned_by":"library"}]}

curl 'http://127.0.0.1:11434/api/tags'
{"models":[{"name":"nomic-embed-text:latest","model":"nomic-embed-text:latest","modified_at":"2025-01-21T15:18:08.92930428+08:00","size":274302450,"digest":"0a109f422b47e3a30ba2b10eca18548e944e8a23073ee3f3e947efcf3c45e59f","details":{"parent_model":"","format":"gguf","family":"nomic-bert","families":["nomic-bert"],"parameter_size":"137M","quantization_level":"F16"}},{"name":"llama3:8b","model":"llama3:8b","modified_at":"2025-01-21T15:17:04.92935713+08:00","size":4661224676,"digest":"365c0bd3c000a25d28ddbf732fe1c6add414de7275464c4e4d1c3b5fcb5d8ad1","details":{"parent_model":"","format":"gguf","family":"llama","families":["llama"],"parameter_size":"8.0B","quantization_level":"Q4_0"}}]}
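The `/api/tags` response above is the same list that `pull_ollama_model` reads in the traceback (`client.list()["models"]`). Below is a minimal sketch of checking a configured model name against that list. It assumes that a name without a tag means `:latest` (Ollama's default tag) and ignores registry-style names that contain a port. `normalize` and `is_installed` are hypothetical helpers for illustration, not part of Wren AI or the Ollama client.

```python
import json


def normalize(name: str) -> str:
    """Append the default ':latest' tag when the name carries no tag (assumption)."""
    return name if ":" in name else f"{name}:latest"


def is_installed(configured: str, tags_response: dict) -> bool:
    """Return True if the configured model appears in an /api/tags style response."""
    installed = {model["name"] for model in tags_response.get("models", [])}
    return normalize(configured) in installed


# Abbreviated copy of the /api/tags output shown above (only the "name" fields).
tags = json.loads(
    '{"models": [{"name": "nomic-embed-text:latest"}, {"name": "llama3:8b"}]}'
)

print(is_installed("nomic-embed-text", tags))  # True
print(is_installed("llama3:8b", tags))         # True
print(is_installed("llama2", tags))            # False
```

Both models referenced in the config are present here, so a missing model is unlikely to be the cause of the startup failure; the error is raised before the list is ever returned.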

@cyyeh
Member

cyyeh commented Jan 21, 2025

@Liudon your model name is wrong, as I said in the comment. It should be openai/<model_name>. Also, you don't need api_key_name; just put OPENAI_API_KEY=<random_string> in .env.

@Liudon

Liudon commented Jan 21, 2025

config.yaml

# you should rename this file to config.yaml and put it in ~/.wrenai
# please pay attention to the comments starting with # and adjust the config accordingly

type: llm
provider: litellm_llm
timeout: 120
models:
- api_base: http://127.0.0.1:11434/v1  # change this to your ollama host, api_base should be <ollama_url>/v1
  model: openai/llama3:8b  # openai/<ollama_model_name>
  kwargs:
    n: 1
    temperature: 0

---
type: embedder
provider: ollama_embedder
models:
  - model: nomic-embed-text  # put your ollama embedder model name here
    dimension: 768
url: http://127.0.0.1:11434  # change this to your ollama host, url should be <ollama_url>
timeout: 120

---
type: engine
provider: wren_ui
endpoint: http://wren-ui:3000

---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 768  # put your embedding model dimension here
timeout: 120
recreate_index: false

---
# the format of llm and embedder should be <provider>.<model_name> such as litellm_llm.gpt-4o-2024-08-06
# the pipes may be not the latest version, please refer to the latest version: https://raw.githubusercontent.com/canner/WrenAI/<WRENAI_VERSION_NUMBER>/docker/config.example.yaml
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: historical_question_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: table_description_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: db_schema_retrieval
    llm: litellm_llm.openai/llama3:8b
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: sql_correction
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: followup_sql_generation
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: sql_summary
    llm: litellm_llm.openai/llama3:8b
  - name: sql_answer
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: sql_breakdown
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: sql_expansion
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: sql_explanation
    llm: litellm_llm.openai/llama3:8b
  - name: sql_regeneration
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: semantics_description
    llm: litellm_llm.openai/llama3:8b
  - name: relationship_recommendation
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: question_recommendation
    llm: litellm_llm.openai/llama3:8b
  - name: question_recommendation_db_schema_retrieval
    llm: litellm_llm.openai/llama3:8b
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: question_recommendation_sql_generation
    llm: litellm_llm.openai/llama3:8b
    engine: wren_ui
  - name: chart_generation
    llm: litellm_llm.openai/llama3:8b
  - name: chart_adjustment
    llm: litellm_llm.openai/llama3:8b
  - name: intent_classification
    llm: litellm_llm.openai/llama3:8b
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: data_assistance
    llm: litellm_llm.openai/llama3:8b
  - name: sql_pairs_indexing
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
  - name: sql_pairs_deletion
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text 
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
    llm: litellm_llm.openai/llama3:8b
  - name: preprocess_sql_data
    llm: litellm_llm.openai/llama3:8b
  - name: sql_executor
    engine: wren_ui
  - name: sql_question_generation
    llm: litellm_llm.openai/llama3:8b
---
settings:
  column_indexing_batch_size: 50
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_using_db_schemas_without_pruning: false
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: true

I changed the config, but startup still fails with the same error.

I0121 11:28:06.781 7 wren-ai-service:18] initializing provider: ollama_embedder
I0121 11:28:06.782 7 wren-ai-service:93] Getting provider: ollama_embedder from {'qdrant': <class 'src.providers.document_store.qdrant.QdrantProvider'>, 'azure_openai_embedder': <class 'src.providers.embedder.azure_openai.AzureOpenAIEmbedderProvider'>, 'ollama_embedder': <class 'src.providers.embedder.ollama.OllamaEmbedderProvider'>, 'openai_embedder': <class 'src.providers.embedder.openai.OpenAIEmbedderProvider'>, 'wren_ui': <class 'src.providers.engine.wren.WrenUI'>, 'wren_ibis': <class 'src.providers.engine.wren.WrenIbis'>, 'wren_engine': <class 'src.providers.engine.wren.WrenEngine'>, 'azure_openai_llm': <class 'src.providers.llm.azure_openai.AzureOpenAILLMProvider'>, 'litellm_llm': <class 'src.providers.llm.litellm.LitellmLLMProvider'>, 'ollama_llm': <class 'src.providers.llm.ollama.OllamaLLMProvider'>, 'openai_llm': <class 'src.providers.llm.openai.OpenAILLMProvider'>}
ERROR:    Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
    yield
  File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 236, in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
    raise exc from None
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
    response = connection.handle_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
    raise exc
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 78, in handle_request
    stream = self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 124, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_backends/sync.py", line 207, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/local/lib/python3.12/contextlib.py", line 155, in __exit__
    self.gen.throw(value)
  File "/app/.venv/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
  File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/src/__main__.py", line 29, in lifespan
    pipe_components = generate_components(settings.components)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/providers/__init__.py", line 395, in generate_components
    identifier: provider_factory(config)
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/providers/__init__.py", line 19, in provider_factory
    return loader.get_provider(config.get("provider"))(**config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/providers/embedder/ollama.py", line 178, in __init__
    pull_ollama_model(self._url, self._embedding_model)
  File "/src/providers/loader.py", line 107, in pull_ollama_model
    models = [model["name"] for model in client.list()["models"]]
                                         ^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/ollama/_client.py", line 333, in list
    return self._request('GET', '/api/tags').json()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/ollama/_client.py", line 69, in _request
    response = self._client.request(method, url, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 837, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 926, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 954, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 991, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 1027, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 235, in handle_request
    with map_httpcore_exceptions():
  File "/usr/local/lib/python3.12/contextlib.py", line 155, in __exit__
    self.gen.throw(value)
  File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 111] Connection refused
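The traceback above fails in `connect_tcp`, before any HTTP request is sent, so the model name is not involved at this point: the process cannot open a TCP connection to the configured `url`. If wren-ai-service runs in a container (the `/app/.venv` and `/src` paths suggest it does, but that is an assumption), `127.0.0.1` refers to that container itself rather than to the machine where Ollama listens. A minimal sketch for checking this from wherever the service runs; `can_connect` is a hypothetical helper, and the URLs to test are passed on the command line.

```python
import socket
import sys
from urllib.parse import urlparse


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port can be opened."""
    try:
        parsed = urlparse(url)
        host, port = parsed.hostname, parsed.port
    except ValueError:
        # Raised for a malformed port such as "http://host:abc".
        return False
    if host is None:
        return False
    if port is None:
        port = 443 if parsed.scheme == "https" else 80
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    for candidate in sys.argv[1:]:
        status = "reachable" if can_connect(candidate) else "refused or unreachable"
        print(f"{candidate}: {status}")
```

Run inside the service container with the same value as `url` in config.yaml and, for comparison, with an address of the Ollama host that is routable from the container. If the loopback address is refused while the routable one is reachable, changing `url` and `api_base` to the routable address is the likely fix.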

@cyyeh

@Liudon

Liudon commented Jan 21, 2025

POST /api/graphql

{
    "data": {
        "modelSync": {
            "status": "UNSYNCRONIZED",
            "__typename": "ModelSyncResponse"
        }
    }
}
Image
[2025-01-21T11:50:42.623] [DEBUG] WrenAIAdaptor - Got error when deploying to wren AI, hash: b3111a0e9ddc421224a428d9ab0169b330ba88d6. Error: connect ECONNREFUSED 172.18.0.5:5555
[2025-01-21T11:50:42.635] [INFO] WrenAIAdaptor - Wren AI: Generating recommendation questions
[2025-01-21T11:50:42.637] [DEBUG] WrenAIAdaptor - Got error when generating recommendation questions: connect ECONNREFUSED 172.18.0.5:5555

@minicoco
Author

Here is my final config; it is working now. Thank you again, and Happy Spring Festival!
Additionally, please note that the server running the custom model must have working GPU drivers installed; otherwise, responses will be very slow. If Ollama runs in a container, the container must also be able to use the GPU.

config.yaml:

type: llm
provider: litellm_llm
timeout: 120
models:
- kwargs:
    n: 1
    temperature: 0
    response_format:
      type: json_object
  # please replace with your model name here; for Ollama through the OpenAI-compatible API it should be openai/<MODEL_NAME>
  model: openai/phi4:14b
  api_base: http://172.20.154.233:11434/v1
  api_key_name: LLM_LM_STUDIO_API_KEY

---
type: embedder
provider: ollama_embedder
models:
  - model: nomic-embed-text
    dimension: 768
url: http://172.20.154.233:11434
timeout: 120

---
type: engine
provider: wren_ui
endpoint: http://wren-ui:3000

---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 768
timeout: 120
recreate_index: true

---
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: historical_question_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: table_description_indexing
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: db_schema_retrieval
    llm: litellm_llm.openai/phi4:14b
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.openai/phi4:14b
    engine: wren_ui
  - name: sql_correction
    llm: litellm_llm.openai/phi4:14b
    engine: wren_ui
  - name: followup_sql_generation
    llm: litellm_llm.openai/phi4:14b
    engine: wren_ui
  - name: sql_summary
    llm: litellm_llm.openai/phi4:14b
  - name: sql_answer
    llm: litellm_llm.openai/phi4:14b
    engine: wren_ui
  - name: sql_breakdown
    llm: litellm_llm.openai/phi4:14b
    engine: wren_ui
  - name: sql_expansion
    llm: litellm_llm.openai/phi4:14b
    engine: wren_ui
  - name: sql_explanation
    llm: litellm_llm.openai/phi4:14b
  - name: sql_regeneration
    llm: litellm_llm.openai/phi4:14b
    engine: wren_ui
  - name: semantics_description
    llm: litellm_llm.openai/phi4:14b
  - name: relationship_recommendation
    llm: litellm_llm.openai/phi4:14b
    engine: wren_ui
  - name: question_recommendation
    llm: litellm_llm.openai/phi4:14b
  - name: intent_classification
    llm: litellm_llm.openai/phi4:14b
    embedder: ollama_embedder.nomic-embed-text
    document_store: qdrant
  - name: data_assistance
    llm: litellm_llm.openai/phi4:14b
  - name: sql_pairs_preparation
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
    llm: litellm_llm.openai/phi4:14b
  - name: sql_pairs_deletion
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text 
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: ollama_embedder.nomic-embed-text
    llm: litellm_llm.openai/phi4:14b
  - name: preprocess_sql_data
    llm: litellm_llm.openai/phi4:14b
  - name: sql_executor
    engine: wren_ui
  - name: chart_generation
    llm: litellm_llm.openai/phi4:14b
  - name: chart_adjustment
    llm: litellm_llm.openai/phi4:14b
---
settings:
  column_indexing_batch_size: 50
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_using_db_schemas_without_pruning: false
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: false

.env:

COMPOSE_PROJECT_NAME=wrenai
PLATFORM=linux/amd64

PROJECT_DIR=.

# service port
WREN_ENGINE_PORT=8080
WREN_ENGINE_SQL_PORT=7432
WREN_AI_SERVICE_PORT=5555
WREN_UI_PORT=3000
IBIS_SERVER_PORT=8000
WREN_UI_ENDPOINT=http://wren-ui:${WREN_UI_PORT}

# ai service settings
QDRANT_HOST=qdrant
SHOULD_FORCE_DEPLOY=1

# vendor keys
LLM_OPENAI_API_KEY=
EMBEDDER_OPENAI_API_KEY=
LLM_AZURE_OPENAI_API_KEY=
EMBEDDER_AZURE_OPENAI_API_KEY=
QDRANT_API_KEY=
LLM_DEEPSEEK_API_KEY=sk-f8dd6fb2d3cd43489f068ed516ab539d

# version
# CHANGE THIS TO THE LATEST VERSION
WREN_PRODUCT_VERSION=0.14.0
WREN_ENGINE_VERSION=0.13.1
WREN_AI_SERVICE_VERSION=0.14.0
IBIS_SERVER_VERSION=0.13.1
WREN_UI_VERSION=0.19.1
WREN_BOOTSTRAP_VERSION=0.1.5

# user id (uuid v4)
USER_UUID=

# for other services
POSTHOG_API_KEY=phc_nhF32aj4xHXOZb0oqr2cn4Oy9uiWzz6CCP4KZmRq9aE
POSTHOG_HOST=https://app.posthog.com
TELEMETRY_ENABLED=true
# this is for telemetry to know the model, i think ai-service might be able to provide a endpoint to get the information
GENERATION_MODEL=gpt-4o-mini
LANGFUSE_SECRET_KEY=
LANGFUSE_PUBLIC_KEY=

# the port exposes to the host
# OPTIONAL: change the port if you have a conflict
HOST_PORT=32333
AI_SERVICE_FORWARD_PORT=5555

# Wren UI
EXPERIMENTAL_ENGINE_RUST_VERSION=false

LLM_LM_STUDIO_API_KEY=abxdxxddxedex
EMBEDDER_OLLAMA_URL=http://172.20.154.233:11434
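
In the config.yaml above, every `llm` and `embedder` entry under `pipes` has the form `<provider>.<model_name>` and has to match a model declared in the corresponding `llm` or `embedder` section. A minimal sketch of checking that consistency on an already parsed config (a list of dicts, one per YAML document). How Wren AI resolves these references internally is an assumption here, and `defined_ids` and `undefined_references` are hypothetical helpers. The sample data is abbreviated from the config above, with one deliberately stale embedder reference added for illustration.

```python
def defined_ids(components: list) -> set:
    """Collect '<provider>.<model>' ids, plus bare engine/document_store provider names."""
    ids = set()
    for comp in components:
        provider = comp.get("provider")
        if comp.get("type") in ("llm", "embedder"):
            for model in comp.get("models", []):
                ids.add(f"{provider}.{model['model']}")
        elif comp.get("type") in ("engine", "document_store"):
            ids.add(provider)
    return ids


def undefined_references(components: list) -> list:
    """Return (pipe name, key, reference) for every pipe entry that matches no declared id."""
    ids = defined_ids(components)
    problems = []
    for comp in components:
        if comp.get("type") != "pipeline":
            continue
        for pipe in comp.get("pipes", []):
            for key in ("llm", "embedder", "engine", "document_store"):
                ref = pipe.get(key)
                if ref is not None and ref not in ids:
                    problems.append((pipe["name"], key, ref))
    return problems


components = [
    {"type": "llm", "provider": "litellm_llm",
     "models": [{"model": "openai/phi4:14b"}]},
    {"type": "embedder", "provider": "ollama_embedder",
     "models": [{"model": "nomic-embed-text", "dimension": 768}]},
    {"type": "engine", "provider": "wren_ui"},
    {"type": "document_store", "provider": "qdrant"},
    {"type": "pipeline", "pipes": [
        {"name": "db_schema_indexing",
         "embedder": "ollama_embedder.nomic-embed-text",
         "document_store": "qdrant"},
        # Deliberately stale reference, added only to show what gets reported.
        {"name": "historical_question_indexing",
         "embedder": "openai_embedder.text-embedding-3-large",
         "document_store": "qdrant"},
        {"name": "sql_generation",
         "llm": "litellm_llm.openai/phi4:14b",
         "engine": "wren_ui"},
    ]},
]

print(undefined_references(components))
# [('historical_question_indexing', 'embedder', 'openai_embedder.text-embedding-3-large')]
```

An empty list means every pipe points at a declared model, engine, or document store.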
