Commit
Merge pull request #59 from SylphAI-Inc/li
Li
liyin2015 authored Jun 30, 2024
2 parents a865b1d + 1260642 commit 2a5a260
Showing 7 changed files with 204 additions and 57 deletions.
4 changes: 4 additions & 0 deletions .env_example
@@ -1,2 +1,6 @@
OPENAI_API_KEY=YOUR_API_KEY_IF_YOU_USE_OPENAI
GROQ_API_KEY=YOUR_API_KEY_IF_YOU_USE_GROQ
ANTHROPIC_API_KEY=YOUR_API_KEY_IF_YOU_USE_ANTHROPIC
GOOGLE_API_KEY=YOUR_API_KEY_IF_YOU_USE_GOOGLE
COHERE_API_KEY=YOUR_API_KEY_IF_YOU_USE_COHERE
HF_TOKEN=YOUR_API_KEY_IF_YOU_USE_HF
102 changes: 102 additions & 0 deletions README.md
@@ -0,0 +1,102 @@
# Introduction

LightRAG is the `PyTorch` library for building large language model (LLM) applications. We help developers with both building and optimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
It is light, modular, and robust.

**PyTorch**

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        # activations, pooling, and flattening follow the canonical MNIST example;
        # without them the 9216-feature input to fc1 does not line up
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.dropout2(F.relu(self.fc1(x)))
        return self.fc2(x)
```

**LightRAG**

```python
from lightrag.core import Component, Generator
from lightrag.components.model_client import GroqAPIClient
from lightrag.utils import setup_env  # noqa - loads API keys from your .env file (see installation docs)

class SimpleQA(Component):
    def __init__(self):
        super().__init__()
        template = r"""<SYS>
You are a helpful assistant.
</SYS>
User: {{input_str}}
You:
"""
        self.generator = Generator(
            model_client=GroqAPIClient(),
            model_kwargs={"model": "llama3-8b-8192"},
            template=template,
        )

    def call(self, query):
        return self.generator({"input_str": query})

    async def acall(self, query):
        return await self.generator.acall({"input_str": query})
```
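
A minimal usage sketch, assuming a `GROQ_API_KEY` is set in your `.env` file:

```python
qa = SimpleQA()
print(qa("What is LightRAG?"))  # sync call via Component.__call__

# the async variant, e.g. from a script:
# import asyncio; asyncio.run(qa.acall("What is LightRAG?"))
```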

## Simplicity

Developers who are building real-world Large Language Model (LLM) applications are the real heroes.
As a library, we provide them with the fundamental building blocks with 100% clarity and simplicity.

* Two fundamental and powerful base classes: `Component` for the pipeline and `DataClass` for data interaction with LLMs (a hedged `DataClass` sketch follows this list).
* We end up with fewer than two levels of subclasses; see the Class Hierarchy Visualization in the docs.
* The result is a library with bare minimum abstraction, providing developers with maximum customizability.
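
As a hedged illustration (not shown in this README), `DataClass` typically pairs with Python's `dataclasses` to describe the data you exchange with an LLM; the import path and the `desc` metadata convention here are assumptions:

```python
from dataclasses import dataclass, field

from lightrag.core import DataClass  # assumed import path

@dataclass
class QAOutput(DataClass):
    # field descriptions the LLM can see when formatting output (assumed convention)
    explanation: str = field(metadata={"desc": "A brief explanation of the answer."})
    answer: str = field(metadata={"desc": "The final answer."})
```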

Similar to PyTorch's `nn.Module`, our `Component` provides excellent visualization of the pipeline structure.
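
For example, printing an instance of the `SimpleQA` component above shows its full structure:

```python
print(SimpleQA())
```

which outputs: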

```
SimpleQA(
  (generator): Generator(
    model_kwargs={'model': 'llama3-8b-8192'},
    (prompt): Prompt(
      template: <SYS>
      You are a helpful assistant.
      </SYS>
      User: {{input_str}}
      You:
      , prompt_variables: ['input_str']
    )
    (model_client): GroqAPIClient()
  )
)
```

## Controllability

Our simplicity does not come from doing 'less'.
On the contrary, we do 'more' and go 'deeper' and 'wider' on every topic to offer developers maximum control and robustness.

* LLMs are sensitive to the prompt. Components like `Prompt`, `OutputParser`, `FunctionTool`, and `ToolManager` give developers full control over their prompts without relying on API features such as tools and JSON format (a hedged sketch follows this list).
* Our goal is not to optimize for integration, but to provide a robust abstraction with representative examples. See this in `ModelClient` and `Retriever`.
* All integrations, such as different API SDKs, ship as optional packages within the same library, so you can easily switch to any model from the providers we officially support.
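
A hedged sketch of this prompt control; the exact `Prompt` signature is an assumption:

```python
from lightrag.core import Prompt  # assumed import path

# the template stays fully in your hands: plain jinja2, no provider-side magic
prompt = Prompt(
    template=r"<SYS> {{task_desc}} </SYS> User: {{input_str}} You:",
    prompt_kwargs={"task_desc": "You are a helpful assistant."},
)
print(prompt(input_str="What is LightRAG?"))  # renders the final prompt string
```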

## Future of LLM Applications

On top of its ease of use, we particularly optimize the configurability of components so researchers can build their own solutions and benchmark existing ones.
Just as PyTorch has united researchers and production teams, LightRAG enables a smooth transition from research to production.
With researchers building on LightRAG, production engineers can easily take over the method and test and iterate on their production data, and researchers will want their code adapted into more products, too.
2 changes: 1 addition & 1 deletion docs/source/_static/class_hierarchy.html
@@ -33,7 +33,7 @@ <h1></h1>

#mynetwork {
width: 100%;
height: 750px;
height: 1000px;
background-color: #ffffff;
border: 1px solid lightgray;
position: relative;
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -25,8 +25,8 @@
# -- Project information -----------------------------------------------------

project = "LightRAG"
copyright = "2024, SylphAI"
author = "SylphAI"
copyright = "2024, SylphAI, Inc"
author = "SylphAI, Inc"

# -- General configuration ---------------------------------------------------

2 changes: 1 addition & 1 deletion docs/source/developer_notes/class_hierarchy.rst
@@ -1,4 +1,4 @@
LightRAG Class Hierarchy Visualization
LightRAG Class Hierarchy
=============================

.. raw:: html
103 changes: 66 additions & 37 deletions docs/source/get_started/installation.rst
@@ -1,41 +1,46 @@
Installation
============

LightRAG can be installed either as a package using pip or set up for development by cloning from GitHub. Follow the appropriate instructions below based on your needs.
LightRAG is available in Python.

Pip Installation
--------------------------------
1. Install LightRAG
~~~~~~~~~~~~~~~~~~~~

For general users who simply want to use LightRAG, the easiest method is to install it directly via pip:
To install the package, run:

.. code-block:: bash
pip install lightrag
After installing the package, you need to set up your environment variables for the project to function properly:
1. **Create an Environment File:**
Create a `.env` file in your project directory (where your scripts using LightRAG will run):
2. Set up API keys
~~~~~~~~~~~~~~~~~~~

.. code-block:: bash
A ``.env`` file is recommended.
You can keep it at your project root directory.
Here is an example:

touch .env
# Open .env and add necessary configurations such as API keys
.. code-block:: bash
2. **Configure Your `.env` File:**
OPENAI_API_KEY=YOUR_API_KEY_IF_YOU_USE_OPENAI
GROQ_API_KEY=YOUR_API_KEY_IF_YOU_USE_GROQ
ANTHROPIC_API_KEY=YOUR_API_KEY_IF_YOU_USE_ANTHROPIC
GOOGLE_API_KEY=YOUR_API_KEY_IF_YOU_USE_GOOGLE
COHERE_API_KEY=YOUR_API_KEY_IF_YOU_USE_COHERE
HF_TOKEN=YOUR_API_KEY_IF_YOU_USE_HF
Add the necessary API keys and other configurations required by LightRAG. This usually includes setting up credentials for accessing various APIs that LightRAG interacts with.
3. **Load Environment Variables:**
3. Load environment variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Make sure your application or scripts load the environment variables from the `.env` file at runtime. If you are using Python, libraries like `python-dotenv` can be used:
You can add the following import:

.. code-block:: bash
.. code-block:: python
pip install python-dotenv
from lightrag.utils import setup_env #noqa
Then, in your Python script, ensure you load the variables:
Or, you can load it yourself:

.. code-block:: python
@@ -44,39 +49,63 @@ Then, in your Python script, ensure you load the variables:
This setup ensures that LightRAG can access all necessary configurations during runtime.

4. Install Optional Packages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


LightRAG currently has built-in support for (1) OpenAI, Groq, Anthropic, Google, and Cohere, and (2) FAISS and Transformers.
You can find all optional packages at :class:`utils.lazy_import.OptionalPackages`.
Make sure to install the necessary SDKs for the components you plan to use.
Here is the list of our tested versions:

.. code-block::
openai = "^1.12.0"
groq = "^0.5.0"
faiss-cpu = "^1.8.0"
sqlalchemy = "^2.0.30"
cohere = "^5.5.8"
pgvector = "^0.2.5"
anthropic = "^0.26.0"
google-generativeai = "^0.5.4"
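
As a hedged illustration of switching providers once the optional SDK is installed (the ``OpenAIClient`` name and model are assumptions, not from this page):

.. code-block:: python

    from lightrag.core import Generator
    from lightrag.components.model_client import OpenAIClient  # assumed client name

    # same Generator, different provider: only the client and model_kwargs change
    generator = Generator(
        model_client=OpenAIClient(),
        model_kwargs={"model": "gpt-3.5-turbo"},
        template=r"<SYS> You are a helpful assistant. </SYS> User: {{input_str}} You:",
    )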
Poetry Installation
--------------------------
.. Poetry Installation
.. --------------------------
Developers and contributors who need access to the source code or wish to contribute to the project should set up their environment as follows:
.. Developers and contributors who need access to the source code or wish to contribute to the project should set up their environment as follows:
1. **Clone the Repository:**
.. 1. **Clone the Repository:**
Start by cloning the LightRAG repository to your local machine:
.. Start by cloning the LightRAG repository to your local machine:
.. code-block:: bash
.. .. code-block:: bash
git clone https://github.com/SylphAI-Inc/LightRAG
cd LightRAG
.. git clone https://github.com/SylphAI-Inc/LightRAG
.. cd LightRAG
2. **Configure API Keys:**
.. 2. **Configure API Keys:**
Copy the example environment file and add your API keys:
.. Copy the example environment file and add your API keys:
.. code-block:: bash
.. .. code-block:: bash
cp .env.example .env
# Open .env and fill in your API keys
.. cp .env.example .env
.. # Open .env and fill in your API keys
3. **Install Dependencies:**
.. 3. **Install Dependencies:**
Use Poetry to install the dependencies and set up the virtual environment:
.. Use Poetry to install the dependencies and set up the virtual environment:
.. code-block:: bash
.. .. code-block:: bash
poetry install
poetry shell
.. poetry install
.. poetry shell
4. **Verification:**
.. 4. **Verification:**
Now, you should be able to run any file within the repository or execute tests to confirm everything is set up correctly.
.. Now, you should be able to run any file within the repository or execute tests to confirm everything is set up correctly.
44 changes: 28 additions & 16 deletions docs/source/index.rst
@@ -2,11 +2,11 @@
Introduction
=======================


LightRAG is the "PyTorch" library for building large langage model(LLM) applications. We help developers on both building and optimimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
LightRAG is the `PyTorch` library for building large language model (LLM) applications. We help developers with both building and optimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
It is light, modular, and robust.



.. grid:: 1
:gutter: 1

@@ -69,26 +69,27 @@ It is light, modular, and robust.
.. and Customizability
Clarity and Simplicity
Simplicity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Developers who are building real-world Large Language Model (LLM) applications are the real heroes.
As a library, we provide them the foundamental building blocks with 100% clarity and simplicity.

.. We support them with require **Maximum Flexibility and Customizability**:
.. Each developer has unique data needs to build their own models/components, experiment with In-context Learning (ICL) or model finetuning, and deploy the LLM applications to production. This means the library must provide fundamental lower-level building blocks and strive for clarity and simplicity:
As a library, we provide them with the fundamental building blocks with 100% clarity and simplicity.

- Two foundamental and powerful base classes: ``component`` for the pipeline and ``DataClass`` for the data interaction with LLMs.
- Two fundamental and powerful base classes: `Component` for the pipeline and `DataClass` for data interaction with LLMs.
- We end up with less than two levels of subclasses. :doc:`developer_notes/class_hierarchy`.
- The result is a library with bare minimum abstraction with maximum flexibility and customizability.
- The result is a library with bare minimum abstraction, providing developers with maximum customizability.

.. - We use 10X less code than other libraries to achieve 10X more robustness and flexibility.
.. - `Class Hierarchy Visualization <developer_notes/class_hierarchy.html>`_
.. We support them with require **Maximum Flexibility and Customizability**:
Similar to PyTorch module, our ``component`` gives us a great visualization on the pipeline structure.
.. Each developer has unique data needs to build their own models/components, experiment with In-context Learning (ICL) or model finetuning, and deploy the LLM applications to production. This means the library must provide fundamental lower-level building blocks and strive for clarity and simplicity:
Similar to the `PyTorch` module, our ``Component`` provides excellent visualization of the pipeline structure.

.. code-block::
@@ -107,13 +108,24 @@ Similar to PyTorch module, our ``component`` gives us a great visualization on t
)
)
Control and Transparency
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. and Robustness
Controllability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Our simplicity does not come from doing 'less'.
On the contrary, we do 'more' and go 'deeper' and 'wider' on every topic to offer developers maximum control and robustness.

- LLMs are sensitive to the prompt. Components like ``Prompt``, ``OutputParser``, ``FunctionTool``, and ``ToolManager`` give developers full control over their prompts without relying on API features such as tools and JSON format.
- Our goal is not to optimize for integration, but to provide a robust abstraction with representative examples. See this in ``ModelClient`` and ``Retriever``.
- All integrations, such as different API SDKs, ship as optional packages within the same library, so you can easily switch to any model from the providers we officially support.



.. Coming from a deep AI research background, we understand that the more control and transparency developers have over their prompts, the better. In default:
Coming from a deep AI research background, we understand that the more control and transparency developers have over their prompts, the better. In default:
.. - LightRAG simplifies what developers need to send to LLM proprietary APIs to just two messages each time: a `system message` and a `user message`. This minimizes reliance on and manipulation by API providers.
- LightRAG simplifies what developers need to send to LLM proprietary APIs to just two messages each time: a `system message` and a `user message`. This minimizes reliance on and manipulation by API providers.
- LightRAG provides advanced tooling for developers to build `agents`, `tools/function calls`, etc., without relying on any proprietary API provider's 'advanced' features such as `OpenAI` assistant, tools, and JSON format
.. - LightRAG provides advanced tooling for developers to build `agents`, `tools/function calls`, etc., without relying on any proprietary API provider's 'advanced' features such as `OpenAI` assistant, tools, and JSON format
It is the future of LLM applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
