diff --git a/site/en/integrations/build_RAG_with_milvus_and_cognee.md b/site/en/integrations/build_RAG_with_milvus_and_cognee.md new file mode 100644 index 000000000..80ead4c01 --- /dev/null +++ b/site/en/integrations/build_RAG_with_milvus_and_cognee.md @@ -0,0 +1,224 @@ +--- +id: build_RAG_with_milvus_and_cognee.md +summary: In this tutorial, we will show you how to build a RAG (Retrieval-Augmented Generation) pipeline with Milvus and Cognee. +title: Build RAG with Milvus and Cognee +--- + + + Open In Colab + + + GitHub Repository + + +# Build RAG with Milvus and Cognee + +[Cognee](https://www.cognee.ai) is a developer-first platform that streamlines AI application development with scalable, modular ECL (Extract, Cognify, Load) pipelines. By integrating seamlessly with Milvus, Cognee enables efficient connection and retrieval of conversations, documents, and transcriptions, reducing hallucinations and optimizing operational costs. + +With strong support for vector stores like Milvus, graph databases, and LLMs, Cognee provides a flexible and customizable framework for building retrieval-augmented generation (RAG) systems. Its production-ready architecture ensures improved accuracy and efficiency for AI-powered applications. + +In this tutorial, we will show you how to build a RAG (Retrieval-Augmented Generation) pipeline with Milvus and Cognee. + + + +```shell +$ pip install pymilvus git+https://github.com/topoteretes/cognee.git +``` + +> If you are using Google Colab, you may need to **restart the runtime** to enable the dependencies you just installed (click on the "Runtime" menu at the top of the screen, and select "Restart session" from the dropdown menu). + +By default, this example uses OpenAI as the LLM. Prepare an [API key](https://platform.openai.com/docs/quickstart) and pass it to the `cognee.config.set_llm_api_key()` function. 
+ +To configure Milvus as the vector database, set the `VECTOR_DB_PROVIDER` to `milvus` and specify the `VECTOR_DB_URL` and `VECTOR_DB_KEY`. Since we are using Milvus Lite to store data in this demo, only the `VECTOR_DB_URL` needs to be provided. + + +```python +import os + +import cognee + +cognee.config.set_llm_api_key("YOUR_OPENAI_API_KEY") + + +os.environ["VECTOR_DB_PROVIDER"] = "milvus" +os.environ["VECTOR_DB_URL"] = "./milvus.db" +``` + +
+ +As for the environment variables `VECTOR_DB_URL` and `VECTOR_DB_KEY`: +- Setting `VECTOR_DB_URL` to a local file, e.g. `./milvus.db`, is the most convenient method, as it automatically uses [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file. +- If you have a large amount of data, you can set up a more performant Milvus server on [Docker or Kubernetes](https://milvus.io/docs/quickstart.md). In this setup, use the server URI, e.g. `http://localhost:19530`, as your `VECTOR_DB_URL`. +- If you want to use [Zilliz Cloud](https://zilliz.com/cloud), the fully managed cloud service for Milvus, set `VECTOR_DB_URL` and `VECTOR_DB_KEY` to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) of your Zilliz Cloud cluster. + + + +### Prepare the data + +We use the FAQ pages from the [Milvus Documentation 2.4.x](https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip) as the private knowledge in our RAG, which is a good data source for a simple RAG pipeline. + +Download the zip file and extract documents to the folder `milvus_docs`. + + +```shell +$ wget https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip +$ unzip -q milvus_docs_2.4.x_en.zip -d milvus_docs +``` + +We load all markdown files from the folder `milvus_docs/en/faq`. For each document, we simply split the file content on "# ", which roughly separates the main sections of the markdown file. 
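To make the effect of this split concrete, here is a minimal sketch on a made-up markdown string. The sample text and the optional `cleaned` step are illustrative assumptions and not part of the tutorial's pipeline, which keeps the raw pieces. The trailing `###` visible in the chunks retrieved later in this tutorial suggests that FAQ questions are written as `####` headings, which is what the sample imitates.

```python
# Hypothetical stand-in for the content of one FAQ markdown file.
sample_md = (
    "# FAQ\n\n"
    "#### What is Milvus?\n\nA vector database.\n\n"
    "#### Is it open source?\n\nYes.\n"
)

# Same operation as in the tutorial: split on the two-character string "# ".
chunks = sample_md.split("# ")

# The first piece is empty (the text starts with "# ") and the middle pieces
# keep a dangling "###" left over from the next "#### " heading.
for chunk in chunks:
    print(repr(chunk))

# Optional cleanup (an assumption, not done in the tutorial): strip leftover
# '#', spaces and newlines from both ends, and drop pieces that end up empty.
cleaned = [c.strip("# \n") for c in chunks if c.strip("# \n")]
print(cleaned)
```

The loading code that follows applies the same `split("# ")` to every FAQ file and keeps the raw pieces, so an empty string and dangling `###` markers also end up in `text_lines`.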
+ + +```python +from glob import glob + +text_lines = [] + +for file_path in glob("milvus_docs/en/faq/*.md", recursive=True): + with open(file_path, "r") as file: + file_text = file.read() + + text_lines += file_text.split("# ") +``` + +## Build RAG + +### Resetting Cognee Data + + +```python +await cognee.prune.prune_data() +await cognee.prune.prune_system(metadata=True) +``` + +With a clean slate ready, we can now add our dataset and process it into a knowledge graph. + +### Adding Data and Cognifying + + +```python +await cognee.add(data=text_lines, dataset_name="milvus_faq") +await cognee.cognify() + +# [DocumentChunk(id=UUID('6889e7ef-3670-555c-bb16-3eb50d1d30b0'), updated_at=datetime.datetime(2024, 12, 4, 6, 29, 46, 472907, tzinfo=datetime.timezone.utc), text='Does the query perform in memory? What are incremental data and historical data?\n\nYes. When ... +# ... +``` + +The `add` method loads the dataset (Milvus FAQs) into Cognee and the `cognify` method processes the data to extract entities, relationships, and summaries, constructing a knowledge graph. + +### Querying for Summaries + +Now that the data has been processed, let's query the knowledge graph. + + +```python +from cognee.api.v1.search import SearchType + +query_text = "How is data stored in milvus?" +search_results = await cognee.search(SearchType.SUMMARIES, query_text=query_text) + +print(search_results[0]) +``` + + {'id': 'de5c6713-e079-5d0b-b11d-e9bacd1e0d73', 'text': 'Milvus stores two data types: inserted data and metadata.'} + + +This query searches the knowledge graph for a summary related to the query text, and the most related candidate is printed. + +### Querying for Chunks + +Summaries offer high-level insights, but for more granular details, we can query specific chunks of data directly from the processed dataset. These chunks are derived from the original data that was added and analyzed during the knowledge graph creation. 
+ + +```python +from cognee.api.v1.search import SearchType + +query_text = "How is data stored in milvus?" +search_results = await cognee.search(SearchType.CHUNKS, query_text=query_text) +``` + +Let's format and display it for better readability! + + +```python +def format_and_print(data): + print("ID:", data["id"]) + print("\nText:\n") + paragraphs = data["text"].split("\n\n") + for paragraph in paragraphs: + print(paragraph.strip()) + print() + + +format_and_print(search_results[0]) +``` + + ID: 4be01c4b-9ee5-541c-9b85-297883934ab3 + + Text: + + Where does Milvus store data? + + Milvus deals with two types of data, inserted data and metadata. + + Inserted data, including vector data, scalar data, and collection-specific schema, are stored in persistent storage as incremental log. Milvus supports multiple object storage backends, including [MinIO](https://min.io/), [AWS S3](https://aws.amazon.com/s3/?nc1=h_ls), [Google Cloud Storage](https://cloud.google.com/storage?hl=en#object-storage-for-companies-of-all-sizes) (GCS), [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs), [Alibaba Cloud OSS](https://www.alibabacloud.com/product/object-storage-service), and [Tencent Cloud Object Storage](https://www.tencentcloud.com/products/cos) (COS). + + Metadata are generated within Milvus. Each Milvus module has its own metadata that are stored in etcd. + + ### + + + +In our previous steps, we queried the Milvus FAQ dataset for both summaries and specific chunks of data. While this provided detailed insights and granular information, the dataset was large, making it challenging to clearly visualize the dependencies within the knowledge graph. + +To address this, we will reset the Cognee environment and work with a smaller, more focused dataset. This will allow us to better demonstrate the relationships and dependencies extracted during the cognify process. 
By simplifying the data, we can clearly see how Cognee organizes and structures information in the knowledge graph. + +### Reset Cognee + + +```python +await cognee.prune.prune_data() +await cognee.prune.prune_system(metadata=True) +``` + +### Adding the Focused Dataset + +Here, a smaller dataset with only one line of text is added and processed to ensure a focused and easily interpretable knowledge graph. + + +```python +# We only use one line of text as the dataset, which simplifies the output later +text = """ + Natural language processing (NLP) is an interdisciplinary + subfield of computer science and information retrieval. + """ + +await cognee.add(text) +await cognee.cognify() +``` + +### Querying for Insights + +By focusing on this smaller dataset, we can now clearly analyze the relationships and structure within the knowledge graph. + + +```python +query_text = "Tell me about NLP" +search_results = await cognee.search(SearchType.INSIGHTS, query_text=query_text) + +for result_text in search_results: + print(result_text) + +# Example output: +# ({'id': UUID('bc338a39-64d6-549a-acec-da60846dd90d'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 1, 211808, tzinfo=datetime.timezone.utc), 'name': 'natural language processing', 'description': 'An interdisciplinary subfield of computer science and information retrieval.'}, {'relationship_name': 'is_a_subfield_of', 'source_node_id': UUID('bc338a39-64d6-549a-acec-da60846dd90d'), 'target_node_id': UUID('6218dbab-eb6a-5759-a864-b3419755ffe0'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 15, 473137, tzinfo=datetime.timezone.utc)}, {'id': UUID('6218dbab-eb6a-5759-a864-b3419755ffe0'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 1, 211808, tzinfo=datetime.timezone.utc), 'name': 'computer science', 'description': 'The study of computation and information processing.'}) +# (...) 
+# +# It represents nodes and relationships in the knowledge graph: +# - The first element is the source node (e.g., 'natural language processing'). +# - The second element is the relationship between nodes (e.g., 'is_a_subfield_of'). +# - The third element is the target node (e.g., 'computer science'). +``` + +This output represents the results of a knowledge graph query, showcasing entities (nodes) and their relationships (edges) as extracted from the processed dataset. Each tuple includes a source entity, a relationship type, and a target entity, along with metadata like unique IDs, descriptions, and timestamps. The graph highlights key concepts and their semantic connections, providing a structured understanding of the dataset. + +Congratulations, you have learned the basic usage of Cognee with Milvus. To learn about more advanced usage of Cognee, please refer to its official [page](https://github.com/topoteretes/cognee). + diff --git a/site/en/integrations/build_RAG_with_milvus_and_gemini.md b/site/en/integrations/build_RAG_with_milvus_and_gemini.md new file mode 100644 index 000000000..d77097e66 --- /dev/null +++ b/site/en/integrations/build_RAG_with_milvus_and_gemini.md @@ -0,0 +1,287 @@ +--- +id: build_RAG_with_milvus_and_gemini.md +summary: In this tutorial, we will show you how to build a RAG (Retrieval-Augmented Generation) pipeline with Milvus and Gemini. We will use Milvus to store and retrieve embeddings of the source documents, and the Gemini model to generate an answer to a given query from the retrieved text. +title: Build RAG with Milvus and Gemini +--- + + + Open In Colab + + + GitHub Repository + + +# Build RAG with Milvus and Gemini + +The [Gemini API](https://ai.google.dev/gemini-api/docs) and [Google AI Studio](https://ai.google.dev/aistudio) help you start working with Google's latest models and turn your ideas into applications that scale. 
Gemini provides access to powerful language models like `Gemini-1.5-Flash`, `Gemini-1.5-Flash-8B`, and `Gemini-1.5-Pro` for tasks such as text generation, document processing, vision, audio analysis, and more. The API allows you to input long context with millions of tokens, fine-tune models for specific tasks, generate structured outputs like JSON, and leverage capabilities like semantic retrieval and code execution. + +In this tutorial, we will show you how to build a RAG (Retrieval-Augmented Generation) pipeline with Milvus and Gemini. We will use Milvus to store and retrieve embeddings of the source documents, and the Gemini model to generate an answer to a given query from the retrieved text. + + + +## Preparation +### Dependencies and Environment + + +```shell +$ pip install --upgrade pymilvus google-generativeai requests tqdm +``` + +
+ +If you are using Google Colab, you may need to **restart the runtime** to enable the dependencies you just installed (click on the "Runtime" menu at the top of the screen, and select "Restart session" from the dropdown menu). + +
+ +You should first log in to the Google AI Studio platform, prepare an [API key](https://aistudio.google.com/apikey), and set it as the `GEMINI_API_KEY` environment variable. + + +```python +import os + +os.environ["GEMINI_API_KEY"] = "***********" +``` + +### Prepare the data + +We use the FAQ pages from the [Milvus Documentation 2.4.x](https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip) as the private knowledge in our RAG, which is a good data source for a simple RAG pipeline. + +Download the zip file and extract documents to the folder `milvus_docs`. + + +```shell +$ wget https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip +$ unzip -q milvus_docs_2.4.x_en.zip -d milvus_docs +``` + +We load all markdown files from the folder `milvus_docs/en/faq`. For each document, we simply split the file content on "# ", which roughly separates the main sections of the markdown file. + + +```python +from glob import glob + +text_lines = [] + +for file_path in glob("milvus_docs/en/faq/*.md", recursive=True): + with open(file_path, "r") as file: + file_text = file.read() + + text_lines += file_text.split("# ") +``` + +### Prepare the LLM and Embedding Model + +We use `gemini-1.5-flash` as the LLM and `text-embedding-004` as the embedding model. + +Let's try to generate a test response from the LLM: + + +```python +import google.generativeai as genai + +genai.configure(api_key=os.environ["GEMINI_API_KEY"]) + +gemini_model = genai.GenerativeModel("gemini-1.5-flash") + +response = gemini_model.generate_content("who are you") +print(response.text) +``` + + I am a large language model, trained by Google. I am an AI and don't have a personal identity or consciousness. My purpose is to process information and respond to a wide range of prompts and questions in a helpful and informative way. 
+ + + +Generate a test embedding and print its dimension and first few elements. + + +```python +test_embeddings = genai.embed_content( + model="models/text-embedding-004", content=["This is a test1", "This is a test2"] +)["embedding"] + +embedding_dim = len(test_embeddings[0]) +print(embedding_dim) +print(test_embeddings[0][:10]) +``` + + 768 + [0.013588584, -0.004361838, -0.08481652, -0.039724775, 0.04723794, -0.0051557426, 0.026071774, 0.045514572, -0.016867816, 0.039378334] + + +## Load data into Milvus + +### Create the Collection + + +```python +from pymilvus import MilvusClient + +milvus_client = MilvusClient(uri="./milvus_demo.db") + +collection_name = "my_rag_collection" +``` + +
+ +As for the arguments of `MilvusClient`: +- Setting `uri` to a local file, e.g. `./milvus.db`, is the most convenient method, as it automatically uses [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file. +- If you have a large amount of data, you can set up a more performant Milvus server on [Docker or Kubernetes](https://milvus.io/docs/quickstart.md). In this setup, use the server URI, e.g. `http://localhost:19530`, as your `uri`. +- If you want to use [Zilliz Cloud](https://zilliz.com/cloud), the fully managed cloud service for Milvus, set `uri` and `token` to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) of your Zilliz Cloud cluster. + +
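The three options above differ only in the keyword arguments passed to `MilvusClient`. The following is a minimal sketch of that choice; the helper name `milvus_client_kwargs` and the mode labels `"lite"`, `"server"`, and `"cloud"` are hypothetical names introduced here for illustration, and the defaults are the example values from the list.

```python
def milvus_client_kwargs(mode, endpoint=None, token=None):
    """Build keyword arguments for MilvusClient for one deployment option.

    The mode labels ("lite", "server", "cloud") are illustrative names for
    this sketch only; they are not part of the pymilvus API.
    """
    if mode == "lite":
        # A local file path makes pymilvus use Milvus Lite.
        return {"uri": endpoint or "./milvus_demo.db"}
    if mode == "server":
        # A self-hosted Milvus server (Docker or Kubernetes).
        return {"uri": endpoint or "http://localhost:19530"}
    if mode == "cloud":
        # Zilliz Cloud needs both the Public Endpoint and the API key.
        if not endpoint or not token:
            raise ValueError("cloud mode requires both endpoint and token")
        return {"uri": endpoint, "token": token}
    raise ValueError(f"unknown mode: {mode!r}")


print(milvus_client_kwargs("lite"))
print(milvus_client_kwargs("server"))
```

With `pymilvus` installed, the result can be passed straight through, for example `MilvusClient(**milvus_client_kwargs("lite"))`, which is equivalent to the `MilvusClient(uri="./milvus_demo.db")` call used in this tutorial.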
+ +Check if the collection already exists and drop it if it does. + + +```python +if milvus_client.has_collection(collection_name): + milvus_client.drop_collection(collection_name) +``` + +Create a new collection with specified parameters. + +If we don't specify any field information, Milvus will automatically create a default `id` field for primary key, and a `vector` field to store the vector data. A reserved JSON field is used to store non-schema-defined fields and their values. + + +```python +milvus_client.create_collection( + collection_name=collection_name, + dimension=embedding_dim, + metric_type="IP", # Inner product distance + consistency_level="Strong", # Strong consistency level +) +``` + +### Insert data +Iterate through the text lines, create embeddings, and then insert the data into Milvus. + +Here is a new field `text`, which is a non-defined field in the collection schema. It will be automatically added to the reserved JSON dynamic field, which can be treated as a normal field at a high level. + + +```python +from tqdm import tqdm + +data = [] + +doc_embeddings = genai.embed_content( + model="models/text-embedding-004", content=text_lines +)["embedding"] + +for i, line in enumerate(tqdm(text_lines, desc="Creating embeddings")): + data.append({"id": i, "vector": doc_embeddings[i], "text": line}) + +milvus_client.insert(collection_name=collection_name, data=data) +``` + + Creating embeddings: 100%|██████████| 72/72 [00:00<00:00, 468201.38it/s] + + + + + + {'insert_count': 72, 'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], 'cost': 0} + + + +## Build RAG + +### Retrieve data for a query + +Let's specify a frequent question about Milvus. + + +```python +question = "How is data stored in milvus?" 
+``` + +Search for the question in the collection and retrieve the semantic top-3 matches. + + +```python +question_embedding = genai.embed_content( + model="models/text-embedding-004", content=question +)["embedding"] + +search_res = milvus_client.search( + collection_name=collection_name, + data=[question_embedding], + limit=3, # Return top 3 results + search_params={"metric_type": "IP", "params": {}}, # Inner product distance + output_fields=["text"], # Return the text field +) +``` + +Let's take a look at the search results of the query + + + +```python +import json + +retrieved_lines_with_distances = [ + (res["entity"]["text"], res["distance"]) for res in search_res[0] +] +print(json.dumps(retrieved_lines_with_distances, indent=4)) +``` + + [ + [ + " Where does Milvus store data?\n\nMilvus deals with two types of data, inserted data and metadata. \n\nInserted data, including vector data, scalar data, and collection-specific schema, are stored in persistent storage as incremental log. Milvus supports multiple object storage backends, including [MinIO](https://min.io/), [AWS S3](https://aws.amazon.com/s3/?nc1=h_ls), [Google Cloud Storage](https://cloud.google.com/storage?hl=en#object-storage-for-companies-of-all-sizes) (GCS), [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs), [Alibaba Cloud OSS](https://www.alibabacloud.com/product/object-storage-service), and [Tencent Cloud Object Storage](https://www.tencentcloud.com/products/cos) (COS).\n\nMetadata are generated within Milvus. Each Milvus module has its own metadata that are stored in etcd.\n\n###", + 0.8048275113105774 + ], + [ + "Does the query perform in memory? What are incremental data and historical data?\n\nYes. When a query request comes, Milvus searches both incremental data and historical data by loading them into memory. 
Incremental data are in the growing segments, which are buffered in memory before they reach the threshold to be persisted in storage engine, while historical data are from the sealed segments that are stored in the object storage. Incremental data and historical data together constitute the whole dataset to search.\n\n###", + 0.7574886679649353 + ], + [ + "What is the maximum dataset size Milvus can handle?\n\n \nTheoretically, the maximum dataset size Milvus can handle is determined by the hardware it is run on, specifically system memory and storage:\n\n- Milvus loads all specified collections and partitions into memory before running queries. Therefore, memory size determines the maximum amount of data Milvus can query.\n- When new entities and and collection-related schema (currently only MinIO is supported for data persistence) are added to Milvus, system storage determines the maximum allowable size of inserted data.\n\n###", + 0.7453608512878418 + ] + ] + + +### Use LLM to get a RAG response + +Convert the retrieved documents into a string format. + + +```python +context = "\n".join( + [line_with_distance[0] for line_with_distance in retrieved_lines_with_distances] +) +``` + +Define the system and user prompts for the language model. The user prompt is assembled from the documents retrieved from Milvus. + + +```python +SYSTEM_PROMPT = """ +Human: You are an AI assistant. You are able to find answers to the questions from the contextual passage snippets provided. +""" +USER_PROMPT = f""" +Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags. +<context> +{context} +</context> +<question> +{question} +</question> +""" +``` + +Use Gemini to generate a response based on the prompts. 
+ + +```python +gemini_model = genai.GenerativeModel( + "gemini-1.5-flash", system_instruction=SYSTEM_PROMPT +) +response = gemini_model.generate_content(USER_PROMPT) +print(response.text) +``` + + Milvus stores data in two ways: Inserted data (vector data, scalar data, and collection-specific schema) is stored as an incremental log in persistent storage using object storage backends such as MinIO, AWS S3, Google Cloud Storage, Azure Blob Storage, Alibaba Cloud OSS, and Tencent Cloud Object Storage. Metadata, generated by each Milvus module, is stored in etcd. + + + +Great! We have successfully built a RAG pipeline with Milvus and Gemini. diff --git a/site/en/integrations/build_RAG_with_milvus_and_ollama.md b/site/en/integrations/build_RAG_with_milvus_and_ollama.md new file mode 100644 index 000000000..001c8c713 --- /dev/null +++ b/site/en/integrations/build_RAG_with_milvus_and_ollama.md @@ -0,0 +1,329 @@ +--- +id: build_RAG_with_milvus_and_ollama.md +summary: In this guide, we’ll show you how to leverage Ollama and Milvus to build a RAG (Retrieval-Augmented Generation) pipeline efficiently and securely. +title: Build RAG with Milvus and Ollama +--- + + + Open In Colab + + + GitHub Repository + + +# Build RAG with Milvus and Ollama + +[Ollama](https://ollama.com/) is an open-source platform that simplifies running and customizing large language models (LLMs) locally. It provides a user-friendly, cloud-free experience, enabling effortless model downloads, installation, and interaction without requiring advanced technical skills. With a growing library of pre-trained LLMs—from general-purpose to domain-specific—Ollama makes it easy to manage and customize models for various applications. It ensures data privacy and flexibility, empowering users to fine-tune, optimize, and deploy AI-driven solutions entirely on their machines. 
+ +In this guide, we’ll show you how to leverage Ollama and Milvus to build a RAG (Retrieval-Augmented Generation) pipeline efficiently and securely. + + + +## Preparation +### Dependencies and Environment + + +```shell +$ pip install pymilvus ollama +``` + +
+ +If you are using Google Colab, you may need to **restart the runtime** to enable the dependencies you just installed (click on the "Runtime" menu at the top of the screen, and select "Restart session" from the dropdown menu). + +
+ +### Prepare the data + +We use the FAQ pages from the [Milvus Documentation 2.4.x](https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip) as the private knowledge in our RAG, which is a good data source for a simple RAG pipeline. + +Download the zip file and extract documents to the folder `milvus_docs`. + + +```shell +$ wget https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip +$ unzip -q milvus_docs_2.4.x_en.zip -d milvus_docs +``` + + --2024-11-26 21:47:19-- https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip + Resolving github.com (github.com)... 140.82.112.4 + Connecting to github.com (github.com)|140.82.112.4|:443... connected. + HTTP request sent, awaiting response... 302 Found + Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/267273319/c52902a0-e13c-4ca7-92e0-086751098a05?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20241127%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241127T024720Z&X-Amz-Expires=300&X-Amz-Signature=7808b77cbdaa7e122196bcd75a73f29f2540333a350c4830bbdf5f286e876304&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dmilvus_docs_2.4.x_en.zip&response-content-type=application%2Foctet-stream [following] + --2024-11-26 21:47:20-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/267273319/c52902a0-e13c-4ca7-92e0-086751098a05?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20241127%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241127T024720Z&X-Amz-Expires=300&X-Amz-Signature=7808b77cbdaa7e122196bcd75a73f29f2540333a350c4830bbdf5f286e876304&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dmilvus_docs_2.4.x_en.zip&response-content-type=application%2Foctet-stream + Resolving objects.githubusercontent.com 
(objects.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.108.133, ... + Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected. + HTTP request sent, awaiting response... 200 OK + Length: 613094 (599K) [application/octet-stream] + Saving to: ‘milvus_docs_2.4.x_en.zip’ + + milvus_docs_2.4.x_e 100%[===================>] 598.72K 1.20MB/s in 0.5s + + 2024-11-26 21:47:20 (1.20 MB/s) - ‘milvus_docs_2.4.x_en.zip’ saved [613094/613094] + + + +We load all markdown files from the folder `milvus_docs/en/faq`. For each document, we simply split the file content on "# ", which roughly separates the main sections of the markdown file. + + +```python +from glob import glob + +text_lines = [] + +for file_path in glob("milvus_docs/en/faq/*.md", recursive=True): + with open(file_path, "r") as file: + file_text = file.read() + + text_lines += file_text.split("# ") +``` + +### Prepare the LLM and Embedding Model + +Ollama supports multiple models for both LLM-based tasks and embedding generation, making it easy to develop retrieval-augmented generation (RAG) applications. For this setup: + +- We will use **Llama 3.2 (3B)** as our LLM for text generation tasks. +- For embedding generation, we will use **mxbai-embed-large**, a 334M parameter model optimized for semantic similarity. + +Before starting, ensure both models are pulled locally: + + +```python +! ollama pull mxbai-embed-large +``` + + pulling manifest + pulling 819c2adf5ce6... 100% ▕████████████████▏ 669 MB + pulling c71d239df917... 100% ▕████████████████▏ 11 KB + pulling b837481ff855... 100% ▕████████████████▏ 16 B + pulling 38badd946f91... 
100% ▕████████████████▏ 408 B + verifying sha256 digest + writing manifest + success + + + +```python +! ollama pull llama3.2 +``` + + pulling manifest + pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB + pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB + pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB + pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB + pulling 56bb8bd477a5... 100% ▕████████████████▏ 96 B + pulling 34bb5ab01051... 100% ▕████████████████▏ 561 B + verifying sha256 digest + writing manifest + success + + +With these models ready, we can proceed to implement LLM-driven generation and embedding-based retrieval workflows. + + + +```python +import ollama + + +def emb_text(text): + response = ollama.embeddings(model="mxbai-embed-large", prompt=text) + return response["embedding"] +``` + +Generate a test embedding and print its dimension and first few elements. + + +```python +test_embedding = emb_text("This is a test") +embedding_dim = len(test_embedding) +print(embedding_dim) +print(test_embedding[:10]) +``` + + 1024 + [0.23276396095752716, 0.4257211685180664, 0.19724100828170776, 0.46120673418045044, -0.46039995551109314, -0.1413791924715042, -0.18261606991291046, -0.07602324336767197, 0.39991313219070435, 0.8337644338607788] + + +## Load data into Milvus + +### Create the Collection + + +```python +from pymilvus import MilvusClient + +milvus_client = MilvusClient(uri="./milvus_demo.db") + +collection_name = "my_rag_collection" +``` + +
+ +As for the arguments of `MilvusClient`: +- Setting `uri` to a local file, e.g. `./milvus.db`, is the most convenient method, as it automatically uses [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file. +- If you have a large amount of data, you can set up a more performant Milvus server on [Docker or Kubernetes](https://milvus.io/docs/quickstart.md). In this setup, use the server URI, e.g. `http://localhost:19530`, as your `uri`. +- If you want to use [Zilliz Cloud](https://zilliz.com/cloud), the fully managed cloud service for Milvus, set `uri` and `token` to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) of your Zilliz Cloud cluster. + +
+ +Check if the collection already exists and drop it if it does. + + +```python +if milvus_client.has_collection(collection_name): + milvus_client.drop_collection(collection_name) +``` + +Create a new collection with specified parameters. + +If we don't specify any field information, Milvus will automatically create a default `id` field for primary key, and a `vector` field to store the vector data. A reserved JSON field is used to store non-schema-defined fields and their values. + + +```python +milvus_client.create_collection( + collection_name=collection_name, + dimension=embedding_dim, + metric_type="IP", # Inner product distance + consistency_level="Strong", # Strong consistency level +) +``` + +### Insert data +Iterate through the text lines, create embeddings, and then insert the data into Milvus. + +Here is a new field `text`, which is a non-defined field in the collection schema. It will be automatically added to the reserved JSON dynamic field, which can be treated as a normal field at a high level. + + +```python +from tqdm import tqdm + +data = [] + +for i, line in enumerate(tqdm(text_lines, desc="Creating embeddings")): + data.append({"id": i, "vector": emb_text(line), "text": line}) + +milvus_client.insert(collection_name=collection_name, data=data) +``` + + Creating embeddings: 100%|██████████| 72/72 [00:03<00:00, 22.56it/s] + + + + + + {'insert_count': 72, 'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], 'cost': 0} + + + +## Build RAG + +### Retrieve data for a query + +Let's specify a frequent question about Milvus. + + +```python +question = "How is data stored in milvus?" +``` + +Search for the question in the collection and retrieve the semantic top-3 matches. 
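Conceptually, an inner-product (`IP`) top-k search scores every stored vector against the query vector and keeps the k highest scores. The brute-force sketch below illustrates that idea in plain Python on toy 2-dimensional vectors; it is an assumption-laden illustration of the metric, not how Milvus executes a search, and the helper names are hypothetical.

```python
def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_by_inner_product(query, vectors, k=3):
    """Return the k (index, score) pairs with the highest inner product."""
    scored = [(idx, inner_product(query, vec)) for idx, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "collection": indexes play the role of the `id` field.
stored = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [2.0, 0.0]]
hits = top_k_by_inner_product([1.0, 0.0], stored, k=3)
print(hits)
```

In this toy data, vector 3 outranks vector 0 even though both point in the same direction as the query, because the inner product grows with vector length. That sensitivity to magnitude may explain why `IP` scores for embeddings that are not unit-normalized can be far above 1. The actual search call for this tutorial follows.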
+ + +```python +search_res = milvus_client.search( + collection_name=collection_name, + data=[ + emb_text(question) + ], # Use the `emb_text` function to convert the question to an embedding vector + limit=3, # Return top 3 results + search_params={"metric_type": "IP", "params": {}}, # Inner product distance + output_fields=["text"], # Return the text field +) +``` + +Let's take a look at the search results of the query + + + +```python +import json + +retrieved_lines_with_distances = [ + (res["entity"]["text"], res["distance"]) for res in search_res[0] +] +print(json.dumps(retrieved_lines_with_distances, indent=4)) +``` + + [ + [ + " Where does Milvus store data?\n\nMilvus deals with two types of data, inserted data and metadata. \n\nInserted data, including vector data, scalar data, and collection-specific schema, are stored in persistent storage as incremental log. Milvus supports multiple object storage backends, including [MinIO](https://min.io/), [AWS S3](https://aws.amazon.com/s3/?nc1=h_ls), [Google Cloud Storage](https://cloud.google.com/storage?hl=en#object-storage-for-companies-of-all-sizes) (GCS), [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs), [Alibaba Cloud OSS](https://www.alibabacloud.com/product/object-storage-service), and [Tencent Cloud Object Storage](https://www.tencentcloud.com/products/cos) (COS).\n\nMetadata are generated within Milvus. Each Milvus module has its own metadata that are stored in etcd.\n\n###", + 231.9398193359375 + ], + [ + "How does Milvus flush data?\n\nMilvus returns success when inserted data are loaded to the message queue. However, the data are not yet flushed to the disk. Then Milvus' data node writes the data in the message queue to persistent storage as incremental logs. 
If `flush()` is called, the data node is forced to write all data in the message queue to persistent storage immediately.\n\n###", + 226.48316955566406 + ], + [ + "What is the maximum dataset size Milvus can handle?\n\n \nTheoretically, the maximum dataset size Milvus can handle is determined by the hardware it is run on, specifically system memory and storage:\n\n- Milvus loads all specified collections and partitions into memory before running queries. Therefore, memory size determines the maximum amount of data Milvus can query.\n- When new entities and and collection-related schema (currently only MinIO is supported for data persistence) are added to Milvus, system storage determines the maximum allowable size of inserted data.\n\n###", + 210.60745239257812 + ] + ] + + +### Use LLM to get a RAG response + +Convert the retrieved documents into a string format. + + +```python +context = "\n".join( + [line_with_distance[0] for line_with_distance in retrieved_lines_with_distances] +) +``` + +Define the system and user prompts for the language model. The user prompt is assembled from the documents retrieved from Milvus. + + +```python +SYSTEM_PROMPT = """ +Human: You are an AI assistant. You are able to find answers to the questions from the contextual passage snippets provided. +""" +USER_PROMPT = f""" +Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags. +<context> +{context} +</context> +<question> +{question} +</question> +""" +``` + +Use the `llama3.2` model provided by Ollama to generate a response based on the prompts. + + + +```python +from ollama import chat +from ollama import ChatResponse + +response: ChatResponse = chat( + model="llama3.2", + messages=[ + {"role": "system", "content": SYSTEM_PROMPT}, + {"role": "user", "content": USER_PROMPT}, + ], +) +print(response["message"]["content"]) +``` + + According to the provided context, data in Milvus is stored in two types: + + 1. 
**Inserted data**: Storing data in persistent storage as incremental log. It supports multiple object storage backends such as MinIO, AWS S3, Google Cloud Storage (GCS), Azure Blob Storage, Alibaba Cloud OSS, and Tencent Cloud Object Storage. + + 2. **Metadata**: Generated within Milvus and stored in etcd. + + +Great! We have successfully built a RAG pipeline with Milvus and Ollama. diff --git a/site/en/integrations/integrations_overview.md b/site/en/integrations/integrations_overview.md index 2d403efe8..2082eee43 100644 --- a/site/en/integrations/integrations_overview.md +++ b/site/en/integrations/integrations_overview.md @@ -55,3 +55,7 @@ This page provides a list of tutorials for you to interact with Milvus and third | [Knowledge Table with Milvus](knowledge_table_with_milvus.md) | Knowledge Engineering | Knowledge Table, Milvus | | [Use Milvus in DocsGPT](use_milvus_in_docsgpt.md) | Ochestration | DocsGPT, Milvus | | [Use Milvus with SambaNova](use_milvus_with_sambanova.md) | Orchestration | Milvus, SambaNova | +| [Build RAG with Milvus and Cognee](build_RAG_with_milvus_and_cognee.md) | Knowledge Engineering | Milvus, Cognee | +| [Build RAG with Milvus and Gemini](build_RAG_with_milvus_and_gemini.md) | LLMs | Milvus, Gemini | +| [Build RAG with Milvus and Ollama](build_RAG_with_milvus_and_ollama.md) | LLMs | Milvus, Ollama | +| [Getting Started with Dynamiq and Milvus](milvus_rag_with_dynamiq.md) | Orchestration | Milvus, Dynamiq | diff --git a/site/en/integrations/milvus_rag_with_dynamiq.md b/site/en/integrations/milvus_rag_with_dynamiq.md new file mode 100644 index 000000000..b5a0033df --- /dev/null +++ b/site/en/integrations/milvus_rag_with_dynamiq.md @@ -0,0 +1,376 @@ +--- +id: milvus_rag_with_dynamiq.md +summary: In this tutorial, we’ll explore how to seamlessly use Dynamiq with Milvus, the high-performance vector database purpose-built for RAG workflows. 
Milvus excels at efficient storage, indexing, and retrieval of vector embeddings, making it an indispensable component for AI systems that demand fast and precise contextual data access. +title: Getting Started with Dynamiq and Milvus +--- + + + Open In Colab + + + GitHub Repository + + +# Getting Started with Dynamiq and Milvus + +[Dynamiq](https://www.getdynamiq.ai/) is a powerful Gen AI framework that streamlines the development of AI-powered applications. With robust support for retrieval-augmented generation (RAG) and large language model (LLM) agents, Dynamiq empowers developers to create intelligent, dynamic systems with ease and efficiency. + +In this tutorial, we’ll explore how to seamlessly use Dynamiq with [Milvus](https://milvus.io/), the high-performance vector database purpose-built for RAG workflows. Milvus excels at efficient storage, indexing, and retrieval of vector embeddings, making it an indispensable component for AI systems that demand fast and precise contextual data access. + +This step-by-step guide will cover two core RAG workflows: + +- **Document Indexing Flow**: Learn how to process input files (e.g., PDFs), transform their content into vector embeddings, and store them in Milvus. Leveraging Milvus’s high-performance indexing capabilities ensures your data is ready for rapid retrieval. + +- **Document Retrieval Flow**: Discover how to query Milvus for relevant document embeddings and use them to generate insightful, context-aware responses with Dynamiq’s LLM agents, creating a seamless AI-powered user experience. + +By the end of this tutorial, you’ll gain a solid understanding of how Milvus and Dynamiq work together to build scalable, context-aware AI systems tailored to your needs. + +## Preparation + + +### Download required libraries + + + +```shell +$ pip install dynamiq pymilvus +``` + +
+ +If you are using Google Colab, you may need to **restart the runtime** to enable the dependencies you just installed (click the "Runtime" menu at the top of the screen and select "Restart session" from the dropdown menu). + +
+ +### Configure the LLM agent + +We will use OpenAI as the LLM in this example. Prepare an [API key](https://platform.openai.com/docs/quickstart) and set it as the `OPENAI_API_KEY` environment variable. + + + +```python +import os + +os.environ["OPENAI_API_KEY"] = "sk-***********" +``` + +## RAG - Document Indexing Flow + +This section demonstrates a Retrieval-Augmented Generation (RAG) workflow for indexing documents with Milvus as the vector database. The workflow takes input PDF files, splits them into smaller chunks, generates vector embeddings using OpenAI's embedding model, and stores the embeddings in a Milvus collection for efficient retrieval. + +By the end of this workflow, you will have a scalable and efficient document indexing system that supports future RAG tasks like semantic search and question answering. + +### Import Required Libraries and Initialize Workflow + + +```python +# Importing necessary libraries for the workflow +from io import BytesIO +from dynamiq import Workflow +from dynamiq.nodes import InputTransformer +from dynamiq.connections import ( + OpenAI as OpenAIConnection, + Milvus as MilvusConnection, + MilvusDeploymentType, +) +from dynamiq.nodes.converters import PyPDFConverter +from dynamiq.nodes.splitters.document import DocumentSplitter +from dynamiq.nodes.embedders import OpenAIDocumentEmbedder +from dynamiq.nodes.writers import MilvusDocumentWriter + +# Initialize the workflow +rag_wf = Workflow() +``` + +### Define PDF Converter Node + + +```python +converter = PyPDFConverter(document_creation_mode="one-doc-per-page") +converter_added = rag_wf.flow.add_nodes( + converter +) # Add node to the DAG (Directed Acyclic Graph) +``` + +### Define Document Splitter Node + + +```python +document_splitter = DocumentSplitter( + split_by="sentence", # Splits documents into sentences + split_length=10, + split_overlap=1, + input_transformer=InputTransformer( + selector={ + "documents": f"${[converter.id]}.output.documents", + }, + ), 
+).depends_on( + converter +) # Set dependency on the PDF converter +splitter_added = rag_wf.flow.add_nodes(document_splitter) # Add to the DAG +``` + +### Define Embedding Node + + +```python +embedder = OpenAIDocumentEmbedder( + connection=OpenAIConnection(api_key=os.environ["OPENAI_API_KEY"]), + input_transformer=InputTransformer( + selector={ + "documents": f"${[document_splitter.id]}.output.documents", + }, + ), +).depends_on( + document_splitter +) # Set dependency on the splitter +document_embedder_added = rag_wf.flow.add_nodes(embedder) # Add to the DAG +``` + +### Define Milvus Vector Store Node + + +```python +vector_store = ( + MilvusDocumentWriter( + connection=MilvusConnection( + deployment_type=MilvusDeploymentType.FILE, uri="./milvus.db" + ), + index_name="my_milvus_collection", + dimension=1536, + create_if_not_exist=True, + metric_type="COSINE", + ) + .inputs(documents=embedder.outputs.documents) # Connect to embedder output + .depends_on(embedder) # Set dependency on the embedder +) +milvus_writer_added = rag_wf.flow.add_nodes(vector_store) # Add to the DAG +``` + + 2024-11-19 22:14:03 - WARNING - Environment variable 'MILVUS_API_TOKEN' not found + 2024-11-19 22:14:03 - INFO - Pass in the local path ./milvus.db, and run it using milvus-lite + 2024-11-19 22:14:04 - DEBUG - Created new connection using: 0bef2849fdb1458a85df8bb9dd27f51d + 2024-11-19 22:14:04 - INFO - Collection my_milvus_collection does not exist. Creating a new collection. + 2024-11-19 22:14:04 - DEBUG - Successfully created collection: my_milvus_collection + 2024-11-19 22:14:05 - DEBUG - Successfully created an index on collection: my_milvus_collection + 2024-11-19 22:14:05 - DEBUG - Successfully created an index on collection: my_milvus_collection + + +
+ +Dynamiq's `MilvusConnection` supports two deployment types, catering to different use cases: + + +1. **MilvusDeploymentType.FILE** + +- Ideal for **local prototyping** or **small-scale data** storage. +- Set the `uri` to a local file path (e.g., `./milvus.db`) to leverage [Milvus Lite](https://milvus.io/docs/milvus_lite.md), which automatically stores all data in the specified file. +- This is a convenient option for **quick setup** and **experimentation**. + + +2. **MilvusDeploymentType.HOST** + +- Designed for **large-scale data** scenarios, such as managing over a million vectors. + + **Self-Hosted Server** + + - Deploy a high-performance Milvus server using [Docker or Kubernetes](https://milvus.io/docs/quickstart.md). + - Configure the server’s address and port as the `uri` (e.g., `http://localhost:19530`). + - If authentication is enabled: + - Provide `<your_username>:<your_password>` as the `token`. + - If authentication is disabled: + - Leave the `token` unset. + + **Zilliz Cloud (Managed Service)** + + - For a fully managed, cloud-based Milvus experience, use [Zilliz Cloud](https://zilliz.com/cloud). + - Set the `uri` and `token` according to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#cluster-details) provided in the Zilliz Cloud console. + +
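The choice between the two deployment types can be sketched in plain Python. The helper `describe_milvus_connection` below is hypothetical and only summarizes the settings as a dictionary. It assumes that a URI without an `http://` or `https://` scheme is a local Milvus Lite file, and that a self-hosted server with authentication takes a token of the form `username:password`. How these values map onto `MilvusConnection` arguments beyond `deployment_type` and `uri` is not shown here and should be checked against the Dynamiq documentation.

```python
def describe_milvus_connection(uri, username=None, password=None, api_key=None):
    """Summarize Milvus connection settings as a plain dict.

    Hypothetical helper for illustration; it does not call Dynamiq.
    """
    if not uri.startswith(("http://", "https://")):
        # Assumption: a local path means Milvus Lite (MilvusDeploymentType.FILE).
        return {"deployment_type": "FILE", "uri": uri, "token": None}

    if api_key:
        # Managed service: the API key is used as the token.
        token = api_key
    elif username and password:
        # Self-hosted server with authentication enabled.
        token = f"{username}:{password}"
    else:
        # Authentication disabled: leave the token unset.
        token = None
    return {"deployment_type": "HOST", "uri": uri, "token": token}


local_settings = describe_milvus_connection("./milvus.db")
server_settings = describe_milvus_connection(
    "http://localhost:19530", username="user", password="secret"
)
print(local_settings)
print(server_settings)
```

Keeping the decision in one small function makes it easy to switch a prototype from a local file to a server by changing only the URI and credentials.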
+ +### Define Input Data and Run the Workflow + + +```python +file_paths = ["./pdf_files/WhatisMilvus.pdf"] +input_data = { + "files": [BytesIO(open(path, "rb").read()) for path in file_paths], + "metadata": [{"filename": path} for path in file_paths], +} + +# Run the workflow with the prepared input data +inserted_data = rag_wf.run(input_data=input_data) +``` + + /var/folders/09/d0hx80nj35sb5hxb5cpc1q180000gn/T/ipykernel_31319/3145804345.py:4: ResourceWarning: unclosed file <_io.BufferedReader name='./pdf_files/WhatisMilvus.pdf'> + BytesIO(open(path, "rb").read()) for path in file_paths + ResourceWarning: Enable tracemalloc to get the object allocation traceback + 2024-11-19 22:14:09 - INFO - Workflow 87878444-6a3d-43f3-ae32-0127564a959f: execution started. + 2024-11-19 22:14:09 - INFO - Flow b30b48ec-d5d2-4e4c-8e25-d6976c8a9c17: execution started. + 2024-11-19 22:14:09 - INFO - Node PyPDF File Converter - 6eb42b1f-7637-407b-a3ac-4167bcf3b5c4: execution started. + 2024-11-19 22:14:09 - INFO - Node PyPDF File Converter - 6eb42b1f-7637-407b-a3ac-4167bcf3b5c4: execution succeeded in 58ms. + 2024-11-19 22:14:09 - INFO - Node DocumentSplitter - 5baed580-6de0-4dcd-bace-d7d947ab6c7f: execution started. + /Users/jinhonglin/anaconda3/envs/myenv/lib/python3.11/site-packages/websockets/legacy/__init__.py:6: DeprecationWarning: websockets.legacy is deprecated; see https://websockets.readthedocs.io/en/stable/howto/upgrade.html for upgrade instructions + warnings.warn( # deprecated in 14.0 - 2024-11-09 + /Users/jinhonglin/anaconda3/envs/myenv/lib/python3.11/site-packages/pydantic/fields.py:804: PydanticDeprecatedSince20: Using extra keyword arguments on `Field` is deprecated and will be removed. Use `json_schema_extra` instead. (Extra keys: 'is_accessible_to_agent'). Deprecated in Pydantic V2.0 to be removed in V3.0. 
See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.7/migration/ + warn( + 2024-11-19 22:14:09 - INFO - Node DocumentSplitter - 5baed580-6de0-4dcd-bace-d7d947ab6c7f: execution succeeded in 104ms. + 2024-11-19 22:14:09 - INFO - Node OpenAIDocumentEmbedder - 91928f67-a00f-48f6-a864-f6e21672ec7e: execution started. + 2024-11-19 22:14:09 - INFO - Node OpenAIDocumentEmbedder - d30a4cdc-0fab-4aff-b2e5-6161a62cb6fd: execution started. + 2024-11-19 22:14:10 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK" + 2024-11-19 22:14:10 - INFO - Node OpenAIDocumentEmbedder - d30a4cdc-0fab-4aff-b2e5-6161a62cb6fd: execution succeeded in 724ms. + 2024-11-19 22:14:10 - INFO - Node MilvusDocumentWriter - dddab4cc-1dae-4e7e-9101-1ec353f530da: execution started. + 2024-11-19 22:14:10 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK" + 2024-11-19 22:14:10 - INFO - Node MilvusDocumentWriter - dddab4cc-1dae-4e7e-9101-1ec353f530da: execution succeeded in 66ms. + 2024-11-19 22:14:10 - INFO - Node OpenAIDocumentEmbedder - 91928f67-a00f-48f6-a864-f6e21672ec7e: execution succeeded in 961ms. + 2024-11-19 22:14:10 - INFO - Flow b30b48ec-d5d2-4e4c-8e25-d6976c8a9c17: execution succeeded in 1.3s. + 2024-11-19 22:14:10 - INFO - Workflow 87878444-6a3d-43f3-ae32-0127564a959f: execution succeeded in 1.3s. + + +Through this workflow, we have successfully implemented a document indexing pipeline using Milvus as the vector database and OpenAI's embedding model for semantic representation. This setup enables fast and accurate vector-based retrieval, forming the foundation for RAG workflows like semantic search, document retrieval, and contextual AI-driven interactions. + +With Milvus's scalable storage capabilities and Dynamiq's orchestration, this solution is ready for both prototyping and large-scale production deployments. 
You can now extend this pipeline to include additional tasks like retrieval-based question answering or AI-driven content generation. + +## RAG Document Retrieval Flow + +In this tutorial, we implement a Retrieval-Augmented Generation (RAG) document retrieval workflow. This workflow takes a user query, generates a vector embedding for it, retrieves the most relevant documents from a Milvus vector database, and uses a large language model (LLM) to generate a detailed and context-aware answer based on the retrieved documents. + +By following this workflow, you will create an end-to-end solution for semantic search and question answering, combining the power of vector-based document retrieval with the capabilities of OpenAI’s advanced LLMs. This approach enables efficient and intelligent responses to user queries by leveraging the stored knowledge in your document database. + +### Import Required Libraries and Initialize Workflow + + +```python +from dynamiq import Workflow +from dynamiq.connections import ( + OpenAI as OpenAIConnection, + Milvus as MilvusConnection, + MilvusDeploymentType, +) +from dynamiq.nodes.embedders import OpenAITextEmbedder +from dynamiq.nodes.retrievers import MilvusDocumentRetriever +from dynamiq.nodes.llms import OpenAI +from dynamiq.prompts import Message, Prompt + +# Initialize the workflow +retrieval_wf = Workflow() +``` + +### Define OpenAI Connection and Text Embedder + + +```python +# Establish OpenAI connection +openai_connection = OpenAIConnection(api_key=os.environ["OPENAI_API_KEY"]) + +# Define the text embedder node +embedder = OpenAITextEmbedder( + connection=openai_connection, + model="text-embedding-3-small", +) + +# Add the embedder node to the workflow +embedder_added = retrieval_wf.flow.add_nodes(embedder) +``` + +### Define Milvus Document Retriever + + +```python +document_retriever = ( + MilvusDocumentRetriever( + connection=MilvusConnection( + deployment_type=MilvusDeploymentType.FILE, uri="./milvus.db" + ), + 
index_name="my_milvus_collection", + dimension=1536, + top_k=5, + ) + .inputs(embedding=embedder.outputs.embedding) # Connect to embedder output + .depends_on(embedder) # Dependency on the embedder node +) + +# Add the retriever node to the workflow +milvus_retriever_added = retrieval_wf.flow.add_nodes(document_retriever) +``` + + 2024-11-19 22:14:19 - WARNING - Environment variable 'MILVUS_API_TOKEN' not found + 2024-11-19 22:14:19 - INFO - Pass in the local path ./milvus.db, and run it using milvus-lite + 2024-11-19 22:14:19 - DEBUG - Created new connection using: 98d1132773af4298a894ad5925845fd2 + 2024-11-19 22:14:19 - INFO - Collection my_milvus_collection already exists. Skipping creation. + + +### Define the Prompt Template + + +```python +# Define the prompt template for the LLM +prompt_template = """ +Please answer the question based on the provided context. + +Question: {{ query }} + +Context: +{% for document in documents %} +- {{ document.content }} +{% endfor %} +""" + +# Create the prompt object +prompt = Prompt(messages=[Message(content=prompt_template, role="user")]) +``` + +### Define the Answer Generator + + + +```python +answer_generator = ( + OpenAI( + connection=openai_connection, + model="gpt-4o", + prompt=prompt, + ) + .inputs( + documents=document_retriever.outputs.documents, + query=embedder.outputs.query, + ) + .depends_on( + [document_retriever, embedder] + ) # Dependencies on retriever and embedder +) + +# Add the answer generator node to the workflow +answer_generator_added = retrieval_wf.flow.add_nodes(answer_generator) +``` + +### Run the Workflow + + +```python +# Run the workflow with a sample query +sample_query = "What is the Advanced Search Algorithms in Milvus?" + +result = retrieval_wf.run(input_data={"query": sample_query}) + +answer = result.output.get(answer_generator.id).get("output", {}).get("content") +print(answer) +``` + + 2024-11-19 22:14:22 - INFO - Workflow f4a073fb-dfb6-499c-8cac-5710a7ad6d47: execution started. 
+ 2024-11-19 22:14:22 - INFO - Flow b30b48ec-d5d2-4e4c-8e25-d6976c8a9c17: execution started. + 2024-11-19 22:14:22 - INFO - Node OpenAITextEmbedder - 47afb0bc-cf96-429d-b58f-11b6c935fec3: execution started. + 2024-11-19 22:14:23 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK" + 2024-11-19 22:14:23 - INFO - Node OpenAITextEmbedder - 47afb0bc-cf96-429d-b58f-11b6c935fec3: execution succeeded in 474ms. + 2024-11-19 22:14:23 - INFO - Node MilvusDocumentRetriever - 51c8311b-4837-411f-ba42-21e28239a2ee: execution started. + 2024-11-19 22:14:23 - INFO - Node MilvusDocumentRetriever - 51c8311b-4837-411f-ba42-21e28239a2ee: execution succeeded in 23ms. + 2024-11-19 22:14:23 - INFO - Node LLM - ac722325-bece-453f-a2ed-135b0749ee7a: execution started. + 2024-11-19 22:14:24 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK" + 2024-11-19 22:14:24 - INFO - Node LLM - ac722325-bece-453f-a2ed-135b0749ee7a: execution succeeded in 1.8s. + 2024-11-19 22:14:25 - INFO - Flow b30b48ec-d5d2-4e4c-8e25-d6976c8a9c17: execution succeeded in 2.4s. + 2024-11-19 22:14:25 - INFO - Workflow f4a073fb-dfb6-499c-8cac-5710a7ad6d47: execution succeeded in 2.4s. + + + The advanced search algorithms in Milvus include a variety of in-memory and on-disk indexing/search algorithms such as IVF (Inverted File), HNSW (Hierarchical Navigable Small World), and DiskANN. These algorithms have been deeply optimized to enhance performance, delivering 30%-70% better performance compared to popular implementations like FAISS and HNSWLib. These optimizations are part of Milvus's design to ensure high efficiency and scalability in handling vector data. 
+ diff --git a/site/en/menuStructure/en.json b/site/en/menuStructure/en.json index 39795db1f..62d84e618 100644 --- a/site/en/menuStructure/en.json +++ b/site/en/menuStructure/en.json @@ -1344,6 +1344,24 @@ "id": "build_RAG_with_milvus_and_siliconflow.md", "order": 4, "children": [] + }, + { + "label": "SambaNova", + "id": "use_milvus_with_sambanova.md", + "order": 5, + "children": [] + }, + { + "label": "Gemini", + "id": "build_RAG_with_milvus_and_gemini.md", + "order": 6, + "children": [] + }, + { + "label": "Ollama", + "id": "build_RAG_with_milvus_and_ollama.md", + "order": 7, + "children": [] } ] }, @@ -1433,15 +1451,15 @@ "children": [] }, { - "label": "SambaNova", - "id": "use_milvus_with_sambanova.md", + "label": "PrivateGPT", + "id": "use_milvus_in_private_gpt.md", "order": 10, "children": [] }, { - "label": "PrivateGPT", - "id": "use_milvus_in_private_gpt.md", - "order": 8, + "label": "Dynamiq", + "id": "milvus_rag_with_dynamiq.md", + "order": 11, "children": [] } ] @@ -1493,6 +1511,12 @@ "id": "knowledge_table_with_milvus.md", "order": 2, "children": [] + }, + { + "label": "Cognee", + "id": "build_RAG_with_milvus_and_cognee.md", + "order": 3, + "children": [] } ] }, diff --git a/site/en/tutorials/tutorials-overview.md b/site/en/tutorials/tutorials-overview.md index 4c6324c4f..520162020 100644 --- a/site/en/tutorials/tutorials-overview.md +++ b/site/en/tutorials/tutorials-overview.md @@ -30,3 +30,4 @@ This page provides a list of tutorials for you to interact with Milvus. 
| [Vector Visualization](vector_visualization.md) | Quickstart | vector search | | [Movie Recommendation with Milvus](movie_recommendation_with_milvus.md) | Recommendation System | vector search | | [Funnel Search with Matryoshka Embeddings](funnel_search_with_matryoshka.md) | Quickstart | vector search | + diff --git a/site/en/tutorials/use_ColPali_with_milvus.md b/site/en/tutorials/use_ColPali_with_milvus.md index 576dce4ad..f9a1c4a1b 100644 --- a/site/en/tutorials/use_ColPali_with_milvus.md +++ b/site/en/tutorials/use_ColPali_with_milvus.md @@ -15,7 +15,7 @@ title: Use ColPali for Multi-Modal Retrieval with Milvus Modern retrieval models typically use a single embedding to represent text or images. ColBERT, however, is a neural model that utilizes a list of embeddings for each data instance and employs a "MaxSim" operation to calculate the similarity between two texts. Beyond textual data, figures, tables, and diagrams also contain rich information, which is often disregarded in text-based information retrieval. -![](../../../assets/colpali_formula.png) +![](../../../images/colpali_formula.png) MaxSim function compares a query with a document (what you're searching in) by looking at their token embeddings. For each word in the query, it picks the most similar word from the document (using cosine similarity or squared L2 distance) and sums these maximum similarities across all words in the query @@ -27,7 +27,6 @@ ColPali is a method that combines ColBERT's multi-vector representation with Pal ## Preparation - ```shell $ pip install pdf2image $ pip pymilvus @@ -61,6 +60,7 @@ import concurrent.futures client = MilvusClient(uri="milvus.db") ``` +
- If you only need a local vector database for small scale data or prototyping, setting the uri as a local file, e.g.`./milvus.db`, is the most convenient method, as it automatically utilizes [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file. @@ -167,7 +167,7 @@ class MilvusColbertRetriever: # Rerank a single document by retrieving its embeddings and calculating the similarity with the query. doc_colbert_vecs = client.query( collection_name=collection_name, - filter=f"doc_id in [{doc_id}, {doc_id + 1}]", + filter=f"doc_id in [{doc_id}]", output_fields=["seq_id", "vector", "doc"], limit=1000, ) diff --git a/site/en/userGuide/use-json-fields.md b/site/en/userGuide/use-json-fields.md index 93dbb9296..3780aebd9 100644 --- a/site/en/userGuide/use-json-fields.md +++ b/site/en/userGuide/use-json-fields.md @@ -658,9 +658,9 @@ for (List results : searchResults) { } // Output: -// SearchResp.SearchResult(entity={color={"label":"red","tag":1018,"coord":[3,30,1],"ref":[["yellow","brown","orange"],["yellow","purple","blue"],["green","purple","purple"]]}, id=295}, score=1.1190735, id=295) -// SearchResp.SearchResult(entity={color={"label":"red","tag":8141,"coord":[38,31,29],"ref":[["blue","white","white"],["green","orange","green"],["yellow","green","black"]]}, id=667}, score=1.0679582, id=667) -// SearchResp.SearchResult(entity={color={"label":"red","tag":6837,"coord":[29,9,8],"ref":[["green","black","blue"],["purple","white","green"],["red","blue","black"]]}, id=927}, score=1.0029297, id=927) +// SearchResp.SearchResult(entity=\{color=\{"label":"red","tag":1018,"coord":[3,30,1],"ref":[["yellow","brown","orange"],["yellow","purple","blue"],["green","purple","purple"]]}, id=295}, score=1.1190735, id=295) +// SearchResp.SearchResult(entity=\{color=\{"label":"red","tag":8141,"coord":[38,31,29],"ref":[["blue","white","white"],["green","orange","green"],["yellow","green","black"]]}, id=667}, score=1.0679582, id=667) +// 
SearchResp.SearchResult(entity=\{color=\{"label":"red","tag":6837,"coord":[29,9,8],"ref":[["green","black","blue"],["purple","white","green"],["red","blue","black"]]}, id=927}, score=1.0029297, id=927) ``` ```javascript