Commit
Merge pull request #59 from SylphAI-Inc/li
Li
liyin2015 authored Jun 30, 2024
2 parents a865b1d + 1260642 commit 2a5a260
Showing 7 changed files with 204 additions and 57 deletions.
4 changes: 4 additions & 0 deletions .env_example
@@ -1,2 +1,6 @@
OPENAI_API_KEY=YOUR_API_KEY_IF_YOU_USE_OPENAI
GROQ_API_KEY=YOUR_API_KEY_IF_YOU_USE_GROQ
ANTHROPIC_API_KEY=YOUR_API_KEY_IF_YOU_USE_ANTHROPIC
GOOGLE_API_KEY=YOUR_API_KEY_IF_YOU_USE_GOOGLE
COHERE_API_KEY=YOUR_API_KEY_IF_YOU_USE_COHERE
HF_TOKEN=YOUR_API_KEY_IF_YOU_USE_HF
102 changes: 102 additions & 0 deletions README.md
@@ -0,0 +1,102 @@
# Introduction

LightRAG is the `PyTorch` library for building large language model (LLM) applications. We help developers with both building and optimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
It is light, modular, and robust.

**PyTorch**

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        # activations, pooling, and flattening follow the canonical MNIST example;
        # without them the 9216-feature input to fc1 does not line up
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.dropout2(F.relu(self.fc1(x)))
        return self.fc2(x)
```

**LightRAG**

```python
from lightrag.core import Component, Generator
from lightrag.components.model_client import GroqAPIClient
from lightrag.utils import setup_env  # noqa - loads API keys from your .env file (see installation docs)

class SimpleQA(Component):
    def __init__(self):
        super().__init__()
        template = r"""<SYS>
You are a helpful assistant.
</SYS>
User: {{input_str}}
You:
"""
        self.generator = Generator(
            model_client=GroqAPIClient(),
            model_kwargs={"model": "llama3-8b-8192"},
            template=template,
        )

    def call(self, query):
        return self.generator({"input_str": query})

    async def acall(self, query):
        return await self.generator.acall({"input_str": query})
```
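
A minimal usage sketch, assuming a `GROQ_API_KEY` is set in your `.env` file:

```python
qa = SimpleQA()
print(qa("What is LightRAG?"))  # sync call via Component.__call__

# the async variant, e.g. from a script:
# import asyncio; asyncio.run(qa.acall("What is LightRAG?"))
```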

## Simplicity

Developers who are building real-world Large Language Model (LLM) applications are the real heroes.
As a library, we provide them with the fundamental building blocks with 100% clarity and simplicity.

* Two fundamental and powerful base classes: `Component` for the pipeline and `DataClass` for data interaction with LLMs (a hedged `DataClass` sketch follows this list).
* We end up with fewer than two levels of subclasses; see the Class Hierarchy Visualization in the docs.
* The result is a library with bare minimum abstraction, providing developers with maximum customizability.
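
As a hedged illustration (not shown in this README), `DataClass` typically pairs with Python's `dataclasses` to describe the data you exchange with an LLM; the import path and the `desc` metadata convention here are assumptions:

```python
from dataclasses import dataclass, field

from lightrag.core import DataClass  # assumed import path

@dataclass
class QAOutput(DataClass):
    # field descriptions the LLM can see when formatting output (assumed convention)
    explanation: str = field(metadata={"desc": "A brief explanation of the answer."})
    answer: str = field(metadata={"desc": "The final answer."})
```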

Similar to PyTorch's `nn.Module`, our `Component` provides excellent visualization of the pipeline structure.
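
For example, printing an instance of the `SimpleQA` component above shows its full structure:

```python
print(SimpleQA())
```

which outputs: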

```
SimpleQA(
  (generator): Generator(
    model_kwargs={'model': 'llama3-8b-8192'},
    (prompt): Prompt(
      template: <SYS>
      You are a helpful assistant.
      </SYS>
      User: {{input_str}}
      You:
      , prompt_variables: ['input_str']
    )
    (model_client): GroqAPIClient()
  )
)
```

## Controllability

Our simplicity does not come from doing 'less'.
On the contrary, we do 'more' and go 'deeper' and 'wider' on every topic to offer developers maximum control and robustness.

* LLMs are sensitive to the prompt. Components like `Prompt`, `OutputParser`, `FunctionTool`, and `ToolManager` give developers full control over their prompts without relying on API features such as tools and JSON format (a hedged sketch follows this list).
* Our goal is not to optimize for integration, but to provide a robust abstraction with representative examples. See this in `ModelClient` and `Retriever`.
* All integrations, such as different API SDKs, ship as optional packages within the same library, so you can easily switch to any model from the providers we officially support.
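
A hedged sketch of this prompt control; the exact `Prompt` signature is an assumption:

```python
from lightrag.core import Prompt  # assumed import path

# the template stays fully in your hands: plain jinja2, no provider-side magic
prompt = Prompt(
    template=r"<SYS> {{task_desc}} </SYS> User: {{input_str}} You:",
    prompt_kwargs={"task_desc": "You are a helpful assistant."},
)
print(prompt(input_str="What is LightRAG?"))  # renders the final prompt string
```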

## Future of LLM Applications

On top of its ease of use, we particularly optimize the configurability of components so researchers can build their own solutions and benchmark existing ones.
Just as PyTorch has united researchers and production teams, LightRAG enables a smooth transition from research to production.
With researchers building on LightRAG, production engineers can easily take over the method and test and iterate on their production data, and researchers will want their code adapted into more products, too.
2 changes: 1 addition & 1 deletion docs/source/_static/class_hierarchy.html
@@ -33,7 +33,7 @@ <h1></h1>

#mynetwork {
width: 100%;
height: 750px;
height: 1000px;
background-color: #ffffff;
border: 1px solid lightgray;
position: relative;
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -25,8 +25,8 @@
# -- Project information -----------------------------------------------------

project = "LightRAG"
copyright = "2024, SylphAI"
author = "SylphAI"
copyright = "2024, SylphAI, Inc"
author = "SylphAI, Inc"

# -- General configuration ---------------------------------------------------

2 changes: 1 addition & 1 deletion docs/source/developer_notes/class_hierarchy.rst
@@ -1,4 +1,4 @@
LightRAG Class Hierarchy Visualization
LightRAG Class Hierarchy
=============================

.. raw:: html
103 changes: 66 additions & 37 deletions docs/source/get_started/installation.rst
@@ -1,41 +1,46 @@
Installation
============

LightRAG can be installed either as a package using pip or set up for development by cloning from GitHub. Follow the appropriate instructions below based on your needs.
LightRAG is available in Python.

Pip Installation
--------------------------------
1. Install LightRAG
~~~~~~~~~~~~~~~~~~~~

For general users who simply want to use LightRAG, the easiest method is to install it directly via pip:
To install the package, run:

.. code-block:: bash
pip install lightrag
After installing the package, you need to set up your environment variables for the project to function properly:
1. **Create an Environment File:**
Create a `.env` file in your project directory (where your scripts using LightRAG will run):
2. Set up API keys
~~~~~~~~~~~~~~~~~~~

.. code-block:: bash
A ``.env`` file is recommended.
You can keep it at your project root directory.
Here is an example:

touch .env
# Open .env and add necessary configurations such as API keys
.. code-block:: bash
2. **Configure Your `.env` File:**
OPENAI_API_KEY=YOUR_API_KEY_IF_YOU_USE_OPENAI
GROQ_API_KEY=YOUR_API_KEY_IF_YOU_USE_GROQ
ANTHROPIC_API_KEY=YOUR_API_KEY_IF_YOU_USE_ANTHROPIC
GOOGLE_API_KEY=YOUR_API_KEY_IF_YOU_USE_GOOGLE
COHERE_API_KEY=YOUR_API_KEY_IF_YOU_USE_COHERE
HF_TOKEN=YOUR_API_KEY_IF_YOU_USE_HF
Add the necessary API keys and other configurations required by LightRAG. This usually includes setting up credentials for accessing various APIs that LightRAG interacts with.
3. **Load Environment Variables:**
3. Load environment variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Make sure your application or scripts load the environment variables from the `.env` file at runtime. If you are using Python, libraries like `python-dotenv` can be used:
You can add the following import:

.. code-block:: bash
.. code-block:: python
pip install python-dotenv
from lightrag.utils import setup_env #noqa
Then, in your Python script, ensure you load the variables:
Or, you can load it yourself:

.. code-block:: python
@@ -44,39 +49,63 @@ Then, in your Python script, ensure you load the variables:
This setup ensures that LightRAG can access all necessary configurations during runtime.

4. Install Optional Packages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


LightRAG currently has built-in support for (1) OpenAI, Groq, Anthropic, Google, and Cohere, and (2) FAISS and Transformers.
You can find all optional packages at :class:`utils.lazy_import.OptionalPackages`.
Make sure to install the necessary SDKs for the components you plan to use.
Here is the list of our tested versions:

.. code-block::
openai = "^1.12.0"
groq = "^0.5.0"
faiss-cpu = "^1.8.0"
sqlalchemy = "^2.0.30"
cohere = "^5.5.8"
pgvector = "^0.2.5"
anthropic = "^0.26.0"
google-generativeai = "^0.5.4"
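
As a hedged illustration of switching providers once the optional SDK is installed (the ``OpenAIClient`` name and model are assumptions, not from this page):

.. code-block:: python

    from lightrag.core import Generator
    from lightrag.components.model_client import OpenAIClient  # assumed client name

    # same Generator, different provider: only the client and model_kwargs change
    generator = Generator(
        model_client=OpenAIClient(),
        model_kwargs={"model": "gpt-3.5-turbo"},
        template=r"<SYS> You are a helpful assistant. </SYS> User: {{input_str}} You:",
    )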
Poetry Installation
--------------------------
.. Poetry Installation
.. --------------------------
Developers and contributors who need access to the source code or wish to contribute to the project should set up their environment as follows:
.. Developers and contributors who need access to the source code or wish to contribute to the project should set up their environment as follows:
1. **Clone the Repository:**
.. 1. **Clone the Repository:**
Start by cloning the LightRAG repository to your local machine:
.. Start by cloning the LightRAG repository to your local machine:
.. code-block:: bash
.. .. code-block:: bash
git clone https://github.com/SylphAI-Inc/LightRAG
cd LightRAG
.. git clone https://github.com/SylphAI-Inc/LightRAG
.. cd LightRAG
2. **Configure API Keys:**
.. 2. **Configure API Keys:**
Copy the example environment file and add your API keys:
.. Copy the example environment file and add your API keys:
.. code-block:: bash
.. .. code-block:: bash
cp .env.example .env
# Open .env and fill in your API keys
.. cp .env.example .env
.. # Open .env and fill in your API keys
3. **Install Dependencies:**
.. 3. **Install Dependencies:**
Use Poetry to install the dependencies and set up the virtual environment:
.. Use Poetry to install the dependencies and set up the virtual environment:
.. code-block:: bash
.. .. code-block:: bash
poetry install
poetry shell
.. poetry install
.. poetry shell
4. **Verification:**
.. 4. **Verification:**
Now, you should be able to run any file within the repository or execute tests to confirm everything is set up correctly.
.. Now, you should be able to run any file within the repository or execute tests to confirm everything is set up correctly.
44 changes: 28 additions & 16 deletions docs/source/index.rst
@@ -2,11 +2,11 @@
Introduction
=======================


LightRAG is the "PyTorch" library for building large langage model(LLM) applications. We help developers on both building and optimimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
LightRAG is the `PyTorch` library for building large language model (LLM) applications. We help developers with both building and optimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
It is light, modular, and robust.



.. grid:: 1
:gutter: 1

@@ -69,26 +69,27 @@ It is light, modular, and robust.
.. and Customizability
Clarity and Simplicity
Simplicity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Developers who are building real-world Large Language Model (LLM) applications are the real heroes.
As a library, we provide them the foundamental building blocks with 100% clarity and simplicity.

.. We support them with require **Maximum Flexibility and Customizability**:
.. Each developer has unique data needs to build their own models/components, experiment with In-context Learning (ICL) or model finetuning, and deploy the LLM applications to production. This means the library must provide fundamental lower-level building blocks and strive for clarity and simplicity:
As a library, we provide them with the fundamental building blocks with 100% clarity and simplicity.

- Two foundamental and powerful base classes: ``component`` for the pipeline and ``DataClass`` for the data interaction with LLMs.
- Two fundamental and powerful base classes: `Component` for the pipeline and `DataClass` for data interaction with LLMs.
- We end up with less than two levels of subclasses. :doc:`developer_notes/class_hierarchy`.
- The result is a library with bare minimum abstraction with maximum flexibility and customizability.
- The result is a library with bare minimum abstraction, providing developers with maximum customizability.

.. - We use 10X less code than other libraries to achieve 10X more robustness and flexibility.
.. - `Class Hierarchy Visualization <developer_notes/class_hierarchy.html>`_
.. We support them with require **Maximum Flexibility and Customizability**:
Similar to PyTorch module, our ``component`` gives us a great visualization on the pipeline structure.
.. Each developer has unique data needs to build their own models/components, experiment with In-context Learning (ICL) or model finetuning, and deploy the LLM applications to production. This means the library must provide fundamental lower-level building blocks and strive for clarity and simplicity:
Similar to the `PyTorch` module, our ``Component`` provides excellent visualization of the pipeline structure.

.. code-block::
@@ -107,13 +108,24 @@ Similar to PyTorch module, our ``component`` gives us a great visualization on t
)
)
Control and Transparency
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. and Robustness
Controllability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Our simplicity does not come from doing 'less'.
On the contrary, we do 'more' and go 'deeper' and 'wider' on every topic to offer developers maximum control and robustness.

- LLMs are sensitive to the prompt. Components like ``Prompt``, ``OutputParser``, ``FunctionTool``, and ``ToolManager`` give developers full control over their prompts without relying on API features such as tools and JSON format.
- Our goal is not to optimize for integration, but to provide a robust abstraction with representative examples. See this in ``ModelClient`` and ``Retriever``.
- All integrations, such as different API SDKs, ship as optional packages within the same library, so you can easily switch to any model from the providers we officially support.



.. Coming from a deep AI research background, we understand that the more control and transparency developers have over their prompts, the better. In default:
Coming from a deep AI research background, we understand that the more control and transparency developers have over their prompts, the better. In default:
.. - LightRAG simplifies what developers need to send to LLM proprietary APIs to just two messages each time: a `system message` and a `user message`. This minimizes reliance on and manipulation by API providers.
- LightRAG simplifies what developers need to send to LLM proprietary APIs to just two messages each time: a `system message` and a `user message`. This minimizes reliance on and manipulation by API providers.
- LightRAG provides advanced tooling for developers to build `agents`, `tools/function calls`, etc., without relying on any proprietary API provider's 'advanced' features such as `OpenAI` assistant, tools, and JSON format
.. - LightRAG provides advanced tooling for developers to build `agents`, `tools/function calls`, etc., without relying on any proprietary API provider's 'advanced' features such as `OpenAI` assistant, tools, and JSON format
It is the future of LLM applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
