sync with main
Alleria1809 committed Jul 2, 2024
2 parents 0c51f7d + 78ed2aa commit 01ec442
Showing 270 changed files with 15,031 additions and 4,564 deletions.
4 changes: 4 additions & 0 deletions .env_example
@@ -1,2 +1,6 @@
OPENAI_API_KEY=YOUR_API_KEY_IF_YOU_USE_OPENAI
GROQ_API_KEY=YOUR_API_KEY_IF_YOU_USE_GROQ
ANTHROPIC_API_KEY=YOUR_API_KEY_IF_YOU_USE_ANTHROPIC
GOOGLE_API_KEY=YOUR_API_KEY_IF_YOU_USE_GOOGLE
COHERE_API_KEY=YOUR_API_KEY_IF_YOU_USE_COHERE
HF_TOKEN=YOUR_API_KEY_IF_YOU_USE_HF
68 changes: 68 additions & 0 deletions .github/workflows/documentation.yml
@@ -0,0 +1,68 @@
name: Documentation

on:
  push:
    branches:
      - xiaoyi_doc # Ensure this is the branch where you commit documentation updates

permissions:
  contents: write
  actions: read

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install Poetry
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Install dependencies using Poetry
        run: |
          poetry config virtualenvs.create false
          poetry install
      - name: Build documentation using Makefile
        run: |
          echo "Building documentation from: $(pwd)"
          ls -l # Debug: List current directory contents
          poetry run make -C docs html
        working-directory: ${{ github.workspace }}

      - name: List built documentation
        run: |
          find ./build/ -type f
        working-directory: ${{ github.workspace }}/docs

      - name: Create .nojekyll file
        run: |
          touch .nojekyll
        working-directory: ${{ github.workspace }}/docs/build

      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_branch: gh-pages
          publish_dir: ./docs/build/
          user_name: github-actions[bot]
          user_email: github-actions[bot]@users.noreply.github.com

      # - name: Debug Output
      #   run: |
      #     pwd # Print the current working directory
      #     ls -l # List files in the build directory
      #     cat ./source/conf.py # Show Sphinx config file for debugging
      #   working-directory: ${{ github.workspace }}/docs/build
67 changes: 67 additions & 0 deletions .github/workflows/documentation_li.yml
@@ -0,0 +1,67 @@
name: Documentation

on:
  push:
    branches:
      - li # Ensure this is the branch where you commit documentation updates

permissions:
  contents: write
  actions: read

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install Poetry
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Install dependencies using Poetry
        run: |
          poetry config virtualenvs.create false
          poetry install
      - name: Build documentation using Makefile
        run: |
          echo "Building documentation from: $(pwd)"
          ls -l # Debug: List current directory contents
          poetry run make -C docs html
        working-directory: ${{ github.workspace }}

      - name: List built documentation
        run: |
          find ./build/ -type f
        working-directory: ${{ github.workspace }}/docs

      - name: Create .nojekyll file
        run: |
          touch .nojekyll
        working-directory: ${{ github.workspace }}/docs/build

      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_branch: gh-pages
          publish_dir: ./docs/build/
          user_name: github-actions[bot]
          user_email: github-actions[bot]@users.noreply.github.com
      # - name: Debug Output
      #   run: |
      #     pwd # Print the current working directory
      #     ls -l # List files in the build directory
      #     cat ./source/conf.py # Show Sphinx config file for debugging
      #   working-directory: ${{ github.workspace }}/docs/build
3 changes: 3 additions & 0 deletions .gitignore
@@ -38,3 +38,6 @@ traces/
*.log
storage/
*.pkl
/*.png
/*.dot
/*.svg
9 changes: 7 additions & 2 deletions .pre-commit-config.yaml
@@ -12,11 +12,16 @@ repos:
     rev: 24.4.2
     hooks:
       - id: black
-        args: ["--line-length=88"]
+        args: ['--line-length=88']

   - repo: https://github.com/astral-sh/ruff-pre-commit
     rev: v0.4.2
     hooks:
       # Run the linter.
       - id: ruff
-        args: ["--fix", "--extend-ignore=E402"]
+        args: ['--fix', '--extend-ignore=E402']
+  # - repo: https://github.com/pycqa/flake8
+  #   rev: 4.0.1
+  #   hooks:
+  #     - id: flake8
+  #       args: ['--max-line-length=88']
103 changes: 103 additions & 0 deletions README.md
@@ -0,0 +1,103 @@
# Introduction

LightRAG is a `PyTorch`-style library for building large language model (LLM) applications. It helps developers both build and optimize `Retriever`-`Agent`-`Generator` (RAG) pipelines.
It is light, modular, and robust.

**PyTorch**

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        # Assumes input of shape (batch, 1, 28, 28), e.g. MNIST.
        x = self.conv1(x)
        x = self.conv2(x)
        x = F.max_pool2d(x, 2)   # 24x24 -> 12x12, so 64 * 12 * 12 = 9216 features
        x = self.dropout1(x)
        x = self.dropout2(x)
        x = torch.flatten(x, 1)  # (batch, 9216), the size fc1 expects
        x = self.fc1(x)
        return self.fc2(x)
```

**LightRAG**

```python
from lightrag.core import Component, Generator
from lightrag.components.model_client import GroqAPIClient
from lightrag.utils import setup_env  # noqa


class SimpleQA(Component):
    def __init__(self):
        super().__init__()
        template = r"""<SYS>
You are a helpful assistant.
</SYS>
User: {{input_str}}
You:
"""
        self.generator = Generator(
            model_client=GroqAPIClient(),
            model_kwargs={"model": "llama3-8b-8192"},
            template=template,
        )

    def call(self, query):
        return self.generator({"input_str": query})

    async def acall(self, query):
        return await self.generator.acall({"input_str": query})
```
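
As a rough illustration of the `call`/`acall` pairing above, the following sketch swaps the real `Generator` for a hypothetical `EchoGenerator`, so it runs without an API key or the library installed. `EchoGenerator`, `SimpleQASketch`, and `ask_many` are names invented for this example and are not part of LightRAG.

```python
import asyncio


class EchoGenerator:
    """Hypothetical stand-in for a generator: echoes the prompt variable back."""

    def __call__(self, prompt_kwargs: dict) -> str:
        return f"echo: {prompt_kwargs['input_str']}"

    async def acall(self, prompt_kwargs: dict) -> str:
        # Yield control once to mimic a non-blocking API request.
        await asyncio.sleep(0)
        return self(prompt_kwargs)


class SimpleQASketch:
    """Mirrors the sync/async method pair of the component above."""

    def __init__(self):
        self.generator = EchoGenerator()

    def call(self, query: str) -> str:
        return self.generator({"input_str": query})

    async def acall(self, query: str) -> str:
        return await self.generator.acall({"input_str": query})


async def ask_many(qa: SimpleQASketch, queries: list) -> list:
    # Run the async calls concurrently; gather keeps results in input order.
    return await asyncio.gather(*(qa.acall(q) for q in queries))


qa = SimpleQASketch()
sync_answer = qa.call("What is RAG?")
async_answers = asyncio.run(ask_many(qa, ["a", "b"]))
```

The synchronous path suits one-off queries, while the asynchronous path lets a caller fan out several queries at once.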

## Simplicity

Developers who are building real-world Large Language Model (LLM) applications are the real heroes.
As a library, we provide them with clear and simple fundamental building blocks.

* Two fundamental and powerful base classes: Component for the pipeline and DataClass for data interaction with LLMs.
* The class hierarchy stays shallow, with fewer than two levels of subclasses (see the Class Hierarchy Visualization).
* The result is a library with the bare minimum of abstraction, giving developers maximum customizability.

As with a PyTorch module, printing a Component displays the structure of the pipeline.

```
SimpleQA(
  (generator): Generator(
    model_kwargs={'model': 'llama3-8b-8192'},
    (prompt): Prompt(
      template: <SYS>
      You are a helpful assistant.
      </SYS>
      User: {{input_str}}
      You:
      , prompt_variables: ['input_str']
    )
    (model_client): GroqAPIClient()
  )
)
```
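
To make the idea behind this printout concrete, here is a minimal sketch of how a base class can register its sub-components and print them as a nested tree, in the spirit of `nn.Module`. It is not LightRAG's actual implementation: `MiniComponent`, `Client`, `Gen`, and `QA` are hypothetical names used only for illustration.

```python
class MiniComponent:
    """Hypothetical base class that records sub-components for printing."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the registry.
        object.__setattr__(self, "_children", {})

    def __setattr__(self, name, value):
        # Any attribute that is itself a component becomes a child node.
        if isinstance(value, MiniComponent):
            self._children[name] = value
        object.__setattr__(self, name, value)

    def extra_repr(self) -> str:
        """Subclasses may return a one-line summary of their own settings."""
        return ""

    def __repr__(self) -> str:
        extra = self.extra_repr()
        lines = [extra] if extra else []
        for name, child in self._children.items():
            # Indent the child's multi-line repr by one more level.
            child_text = repr(child).replace("\n", "\n  ")
            lines.append(f"({name}): {child_text}")
        if not lines:
            return f"{type(self).__name__}()"
        body = "\n  ".join(lines)
        return f"{type(self).__name__}(\n  {body}\n)"


class Client(MiniComponent):
    pass


class Gen(MiniComponent):
    def __init__(self, model: str):
        super().__init__()
        self.model = model
        self.model_client = Client()

    def extra_repr(self) -> str:
        return f"model={self.model!r}"


class QA(MiniComponent):
    def __init__(self):
        super().__init__()
        self.generator = Gen("llama3-8b-8192")


structure = repr(QA())
print(structure)
```

Because children are registered at assignment time, a pipeline author gets the tree view without writing any printing code in the subclasses.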

## Controllability

Our simplicity does not come from doing 'less'.
On the contrary, we have to do 'more' and go 'deeper' and 'wider' on every topic to offer developers maximum control and robustness.

* LLMs are sensitive to the prompt. Components like Prompt, OutputParser, FunctionTool, and ToolManager give developers full control over their prompts without relying on API features such as tools and JSON format.
* Our goal is not to maximize the number of integrations, but to provide a robust abstraction with representative examples. See this in ModelClient and Retriever.
* All integrations, such as different API SDKs, are optional packages within the same library. You can easily switch between models from any of the providers that we officially support.
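
The points above can be sketched in a few lines of plain Python: the prompt is an ordinary string the developer owns, structured output is parsed from raw text rather than requested through a provider's JSON mode, and the model client is anything callable. Everything here is an assumption made for illustration: `render`, `parse_json_output`, `ask`, and the two fake providers are hypothetical and do not reflect LightRAG's `Prompt`, `OutputParser`, or `ModelClient` APIs.

```python
import json
import re
from typing import Callable, Dict

TEMPLATE = (
    "<SYS>\nYou are a helpful assistant. Reply with a JSON object.\n</SYS>\n"
    "User: {{input_str}}\nYou:\n"
)


def render(template: str, variables: Dict[str, str]) -> str:
    """Fill {{name}} placeholders; a tiny stand-in for a real template engine."""

    def substitute(match: re.Match) -> str:
        return str(variables[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def parse_json_output(text: str) -> dict:
    """Parse the outermost {...} span found in raw model text."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])


# Hypothetical providers: any callable mapping a prompt string to raw text.
def fake_provider_a(prompt: str) -> str:
    return 'Sure! {"answer": "42"}'


def fake_provider_b(prompt: str) -> str:
    return '{"answer": "forty-two"}'


def ask(client: Callable[[str], str], question: str) -> dict:
    prompt = render(TEMPLATE, {"input_str": question})
    return parse_json_output(client(prompt))


prompt_text = render(TEMPLATE, {"input_str": "What is 6 x 7?"})
result_a = ask(fake_provider_a, "What is 6 x 7?")
result_b = ask(fake_provider_b, "What is 6 x 7?")
```

Switching providers changes only the callable passed to `ask`; the prompt and the parsing logic stay under the developer's control.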

## Future of LLM Applications

Beyond ease of use, we particularly optimize the configurability of components so that researchers can build their own solutions and benchmark existing ones.
Just as PyTorch united researchers and production teams, LightRAG aims to enable a smooth transition from research to production.
With researchers building on LightRAG, production engineers can easily take over a method, then test and iterate on it with their production data.
Researchers, in turn, will want their code to be adopted in more products.
File renamed without changes.