From 9269e829a2de28e6a0ed56d32db9e53ae3bd5fed Mon Sep 17 00:00:00 2001
From: slobentanzer
Date: Wed, 14 Feb 2024 13:31:00 +0100
Subject: [PATCH] describe KG prompt generation

---
 content/40.methods.md | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/content/40.methods.md b/content/40.methods.md
index ca48df5..8f5d0e4 100644
--- a/content/40.methods.md
+++ b/content/40.methods.md
@@ -74,9 +74,14 @@ In the BioCypher KG creation, we use a configuration file to map KG contents to
 For instance, we detail the properties of a node and the source and target classes of an edge.
 Additionally, during the KG build process, we enrich this information and save it to a YAML file and, optionally, directly to the KG.
 This information is used by BioChatter to tune its understanding of the KG, which allows the LLM to query the KG more efficiently.
+
 By understanding the context of the KG, the exact contents, and the exact spelling of all identifiers and properties, we effectively support the LLM in generating correct queries.
-To illustrate the usage of this feature, we provide a demonstration repository at [https://github.com/biocypher/pole](https://github.com/biocypher/pole) including a KG build procedure and web app, which can be run using a single Docker Compose command.
-The pole KG can also be used in conjunction with the BioChatter Next app by using the `docker-compose-incl-kg.yaml` file to build the application locally.
+The query generation process is broken up into multiple steps by BioChatter: recognising entities and relationships according to the user's question, estimating properties to be used in the query, and generating a syntactically correct query in the query language of the database, based on the results from the previous steps and constraints given by the KG schema information.
+This procedure is implemented in the `prompts.py` module.
+To evaluate the quality of this process, we dedicate a module in the benchmark to the query generation process with a range of questions and KG schemata.
+
+To illustrate the usage of this feature, we provide a demonstration repository at [https://github.com/biocypher/pole](https://github.com/biocypher/pole) including a KG build procedure and an instance of BioChatter Light, which can be run using a single Docker Compose command.
+The pole KG can also be used in conjunction with the BioChatter Next app by using the `docker-compose.yaml` file to build the application locally.
 A demonstration of this use case is available in [Supplementary Note 1: Knowledge Graph Retrieval-Augmented Generation] and on our website ([https://biochatter.org/vignette-kg/](https://biochatter.org/vignette-kg/)).
 
 ### Retrieval-Augmented Generation