
Commit: Rebase on main
liyin2015 committed Jul 9, 2024
2 parents fcd531e + 61245d1 commit cd066dc
Showing 16 changed files with 408 additions and 201 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/documentation.yml
@@ -3,7 +3,7 @@ name: Documentation
on:
  push:
    branches:
      - li # Trigger the workflow when changes are pushed to the release branch
      - release # Trigger the workflow when changes are pushed to the release branch

permissions:
  contents: write
24 changes: 15 additions & 9 deletions README.md
@@ -13,15 +13,21 @@
### ⚡ The PyTorch Library for Large Language Model Applications ⚡

*LightRAG* helps developers with both building and optimizing *Retriever-Agent-Generator (RAG)* pipelines.
It is *light*, *modular*, and *robust*.
It is *light*, *modular*, and *robust*, with a 100% readable codebase.




# Design Philosophy
# Why LightRAG?

LLMs are like water; they can almost do anything, from GenAI applications such as chatbots, translation, summarization, code generation, and autonomous agents to classical NLP tasks like text classification and named entity recognition. They interact with the world beyond the model’s internal knowledge via retrievers, memory, and tools (function calls). Each use case is unique in its data, business logic, and user experience.

Because of this, no library can provide out-of-the-box solutions. Users must build towards their own use case, which requires the library to be modular and robust, with a clean and readable codebase. The only code you should put into production is code you either trust 100% or are 100% clear about how to customize and iterate on.

LightRAG is born to be light, modular, and robust, with a 100% readable codebase.

Further reading: [Design Philosophy](https://lightrag.sylph.ai/developer_notes/lightrag_design_philosophy.html) and [Class hierarchy](https://lightrag.sylph.ai/developer_notes/class_hierarchy.html).

LightRAG follows three fundamental principles from day one: simplicity over complexity, quality over quantity, and optimizing over building.
This design philosophy results in a library with bare minimum abstraction, providing developers with maximum customizability. View Class hierarchy [here](https://lightrag.sylph.ai/developer_notes/class_hierarchy.html).

<!--
@@ -64,8 +70,8 @@ from lightrag.components.output_parsers import JsonOutputParser

@dataclass
class QAOutput(DataClass):
    explaination: str = field(
        metadata={"desc": "A brief explaination of the concept in one sentence."}
    explanation: str = field(
        metadata={"desc": "A brief explanation of the concept in one sentence."}
    )
    example: str = field(metadata={"desc": "An example of the concept in a sentence."})

@@ -129,7 +135,7 @@ QA(
</OUTPUT_FORMAT>
</SYS>
User: {{input_str}}
You:, prompt_kwargs: {'output_format_str': 'Your output should be formatted as a standard JSON instance with the following schema:\n```\n{\n "explaination": "A brief explaination of the concept in one sentence. (str) (required)",\n "example": "An example of the concept in a sentence. (str) (required)"\n}\n```\n-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!\n-Use double quotes for the keys and string values.\n-Follow the JSON formatting conventions.'}, prompt_variables: ['output_format_str', 'input_str']
You:, prompt_kwargs: {'output_format_str': 'Your output should be formatted as a standard JSON instance with the following schema:\n```\n{\n "explanation": "A brief explanation of the concept in one sentence. (str) (required)",\n "example": "An example of the concept in a sentence. (str) (required)"\n}\n```\n-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!\n-Use double quotes for the keys and string values.\n-Follow the JSON formatting conventions.'}, prompt_variables: ['output_format_str', 'input_str']
)
(model_client): GroqAPIClient()
(output_processors): JsonOutputParser(
@@ -160,7 +166,7 @@ QA(
Here is what we get from ``print(output)``:

```
GeneratorOutput(data=QAOutput(explaination='LLM stands for Large Language Model, which refers to a type of artificial intelligence designed to process and generate human-like language.', example='For instance, LLMs are used in chatbots and virtual assistants, such as Siri and Alexa, to understand and respond to natural language input.'), error=None, usage=None, raw_response='```\n{\n "explaination": "LLM stands for Large Language Model, which refers to a type of artificial intelligence designed to process and generate human-like language.",\n "example": "For instance, LLMs are used in chatbots and virtual assistants, such as Siri and Alexa, to understand and respond to natural language input."\n}', metadata=None)
GeneratorOutput(data=QAOutput(explanation='LLM stands for Large Language Model, which refers to a type of artificial intelligence designed to process and generate human-like language.', example='For instance, LLMs are used in chatbots and virtual assistants, such as Siri and Alexa, to understand and respond to natural language input.'), error=None, usage=None, raw_response='```\n{\n "explanation": "LLM stands for Large Language Model, which refers to a type of artificial intelligence designed to process and generate human-like language.",\n "example": "For instance, LLMs are used in chatbots and virtual assistants, such as Siri and Alexa, to understand and respond to natural language input."\n}', metadata=None)
```
**See the prompt**

@@ -184,7 +190,7 @@ You are a helpful assistant.
Your output should be formatted as a standard JSON instance with the following schema:
```
{
"explaination": "A brief explaination of the concept in one sentence. (str) (required)",
"explanation": "A brief explanation of the concept in one sentence. (str) (required)",
"example": "An example of the concept in a sentence. (str) (required)"
}
```
68 changes: 59 additions & 9 deletions developer_notes/logging_config.py
@@ -90,32 +90,82 @@ def use_only_child_logger():
    child_logger.info(f"output using app logger {__name__}: {output}")


def use_native_logging():
def user_program_lightrag_config():
    from lightrag.utils.logger import get_logger

    # use it with root logger
    root_logger = get_logger(
    log = get_logger(
        name=__name__,
        level="INFO",
        enable_console=True,
        enable_file=False,
        enable_file=True,
        filename="app.log",
    )
    log.info("This is a user program child logger in app.log")


# native logging will inherit the configuration of the root logger
# similar to config 1 where we only use root logger
def user_program():
    import logging

    log = logging.getLogger(__name__)
    log.info("test native logging")
    log.info("This is a user program child logger")


def use_native_root_logging():
    # usually the main program will use the root logger to gather all logs

    # use it with root logger
    root_logger = get_logger(
        level="INFO",
        enable_console=True,
        enable_file=True,
    )

    # call user program
    user_program()

    root_logger.info("test root logger")
    generator = Generator.from_config(generator_config)
    output = generator(prompt_kwargs={"input_str": "how are you?"})
    root_logger.info(f"output using root logger: {output}")


def use_named_logger():
    # use it with named logger
    named_logger = get_logger(
        name="app1",
        level="INFO",
        enable_console=True,
        enable_file=True,
    )

    user_program()
    # will only include logs here
    named_logger.info("test named logger")
    generator = Generator.from_config(generator_config)
    output = generator(prompt_kwargs={"input_str": "how are you?"})
    named_logger.info(f"output using named logger {__name__}: {output}")


def use_lightrag_logger():
    # set up root logger for the library
    # get_logger(
    #     level="INFO",
    #     enable_file=True,
    #     filename="app.log",
    # )
    # # use the library
    # generator = Generator.from_config(generator_config)
    # output = generator(prompt_kwargs={"input_str": "how are you?"})
    # use a logger for the user program
    user_program_lightrag_config()


if __name__ == "__main__":
    # check_console_logging()
    # get_logger_and_enable_library_logging_in_same_file()
    separate_library_logging_and_app_logging()
    # separate_library_logging_and_app_logging()
    # use_only_child_logger()
    use_native_logging()
    # use_native_root_logging()
    # use_named_logger()
    use_lightrag_logger()
    printc("All logging examples are done. Feeling green!", color="green")
3 changes: 3 additions & 0 deletions docs/source/_static/css/custom.css
@@ -6,6 +6,9 @@
--bs-gray-500:#adb5bd;


}
.theme-switch-button {
display: none;
}
.theme-version {
display: none;
2 changes: 1 addition & 1 deletion docs/source/apis/index.rst
@@ -1,7 +1,7 @@
API Reference
=============

Welcome to the `LightRAG`.
Welcome to `LightRAG`.
The API reference is organized by subdirectories.

.. This section provides detailed documentation of the internal APIs that make up the LightRAG framework. Explore the APIs to understand how to effectively utilize and integrate LightRAG components into your projects.
88 changes: 70 additions & 18 deletions docs/source/developer_notes/logging.rst
@@ -1,22 +1,23 @@
Logging
====================

Python logging module [1]_ is a powerful and flexible tool for debugging and tracing.
LightRAG uses the native ``logging`` module as the *first line of defense*.

The Python logging module [1]_ is a powerful and flexible tool for debugging and tracing.
LightRAG uses the native logging module as the *first line of defense*.

Design
--------------------
Some libraries may use ``hooks`` [2]_ and ``Callbacks`` [3]_ [4]_, or advanced web-based debugging tools [5]_ [6]_ [7]_.
``hooks`` and ``callbacks`` are conceptually similar in that they both allow users to execute custom code at specific points during the execution of a program.
Both provide mechanisms to inject additional behavior in response to certain events or conditions, without modifying its core logic.
Some libraries may use hooks [2]_ and callbacks [3]_ [4]_, or advanced web-based debugging tools [5]_ [6]_ [7]_.
Hooks and callbacks are conceptually similar in that they both allow users to execute custom code at specific points during the execution of a program.
Both provide mechanisms to inject additional behavior in response to certain events or conditions, without modifying the core logic.
PyTorch defines, registers, and executes hooks mainly in its base classes like `nn.Module` and `Tensor`, without polluting the functional and user-facing APIs.
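
As a generic illustration (not LightRAG code), a callback is simply a function handed to a component and invoked at a defined point of its execution:

.. code-block:: python

    from typing import Callable, List

    class Pipeline:
        def __init__(self):
            self._on_end: List[Callable[[str], None]] = []

        def register_on_end(self, callback: Callable[[str], None]) -> None:
            # store the callback; it runs after each call without touching the core logic
            self._on_end.append(callback)

        def run(self, text: str) -> str:
            output = text.upper()  # stand-in for the real work
            for callback in self._on_end:
                callback(output)
            return output

    pipe = Pipeline()
    pipe.register_on_end(lambda out: print(f"callback saw: {out}"))
    pipe.run("hello")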

At this point, our objectives are:

1. Maximize the debugging capabilities via the simple logging module to keep the source code clean.
1. Maximize debugging capabilities via the simple logging module to keep the source code clean.
2. Additionally, as we can't always control the outputs of generators, we will provide customized loggers and tracers (drop-in decorators) for them, which we explain in :doc:`logging_tracing`; a generic sketch is shown below. This will not break the first objective.
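
As a generic sketch only (the actual LightRAG tracers are covered in :doc:`logging_tracing`), a drop-in tracing decorator can be as simple as:

.. code-block:: python

    import functools
    import logging

    log = logging.getLogger(__name__)

    def trace_output(func):
        """Log a function's inputs and output without changing its behavior."""

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            log.info(f"{func.__name__} called with {args}, {kwargs} -> {result}")
            return result

        return wrapper

    @trace_output
    def generate(prompt: str) -> str:
        return "..."  # stand-in for a generator call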

In the future, when we have more complex requirements from users, we will consider adding hooks/callbacks but doing it in a way to keep the functional and user-facing APIs clean.
In the future, when we have more complex requirements from users, we will consider adding hooks/callbacks, but we will do so in a way that keeps the functional and user-facing APIs clean.

How the library logs
~~~~~~~~~~~~~~~~~~~~~~
@@ -28,8 +29,8 @@ In each file, we simply set the logger with the following code:
log = logging.getLogger(__name__)
And we will use `log` and decide what level of logging we want to use in each function.
Here is how :ref:`Generator logs<core.generator>`.
We then use `log` and decide what level of logging we want to use in each function.
Here is how :ref:`Generator <core.generator>` logs.
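
For illustration, here is a minimal sketch (not the actual ``Generator`` source) of how a library file might use its module-level logger at different levels:

.. code-block:: python

    import logging

    log = logging.getLogger(__name__)

    def call_model(prompt: str) -> str:
        log.debug(f"prompt: {prompt}")  # verbose detail for debugging
        response = "..."  # stand-in for the real model call
        log.info("model call completed")  # high-level trace
        return response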

How users set up the logger
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -94,19 +95,70 @@ The output will be:
Use Logger in Projects
-------------------------
There are three distinct ways to set up the logging in your project:
There are two distinct ways to set up the logging in your project:

1. Have both the library logging and your application logging in a single file. This is the simplest setup.
2. Use both the root logger and a named logger to log library and application logs separately.

Set up all logs in one file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Assume your source code is at `src/task.py`. You can log simply by:

.. code-block:: python

    import logging

    log = logging.getLogger(__name__)

    class Task:
        def __init__(self):
            log.info("This is a user program child logger")

In the main file, you can configure a single root logger to capture both library and application logs:

.. code-block:: python

    import logging
    from lightrag.utils.logger import get_logger

    root_logger = get_logger(level="DEBUG", save_dir="./logs")  # log to ./logs/lib.log

    # run code from the library components such as generator
    # ....
    root_logger.info("This is the log in the main file")

This way, all logs will be saved in `./logs/lib.log`.

Separate library and application logs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If users prefer to keep the library and application logs separate, they can use a named logger.

In the user program, such as at `src/task.py`, you can set up a named logger that logs to `./logs/my_app.log`:

.. code-block:: python

    from lightrag.utils.logger import get_logger

    app_logger = get_logger(name="my_app", level="DEBUG", save_dir="./logs")  # log to ./logs/my_app.log

    class Task:
        def __init__(self):
            app_logger.info("This is a user program child logger")

The difference is that handlers are already attached to ``app_logger``, so in the main file you do not need to set up a root logger to enable your application logs.
However, you can still set up a root logger to capture the library logs separately if needed, and create another named logger to continue logging in the main file.
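
For example, here is a minimal main-file sketch; it assumes ``get_logger`` accepts the same ``name``, ``level``, and ``save_dir`` arguments used above and that the root logger writes to `./logs/lib.log` by default:

.. code-block:: python

    from lightrag.utils.logger import get_logger

    # root logger gathers the library logs (assumed to default to ./logs/lib.log)
    root_logger = get_logger(level="INFO", save_dir="./logs")

    # named logger keeps the application logs in ./logs/my_app.log, separate from the library
    main_logger = get_logger(name="my_app", level="DEBUG", save_dir="./logs")

    # run library components such as Generator here ...
    main_logger.info("application log in the main file")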


1. Use root logger only and have all the library and your application logging in one file. This is the simplest setup.
2. Use only named logger to log your application logs in a file.
3. Use both root and named logger to log library and application logs separately.

It works similarly if it is to be logged to console.
Config 3 can be quite neat:
.. It works similarly if it is to be logged to console.
.. Config 3 can be quite neat:
- You can enable different levels of logging for the library and your application.
- You can easily focus on debugging your own code without being distracted by the library logs and still have the option to see the library logs if needed.
.. - You can enable different levels of logging for the library and your application.
.. - You can easily focus on debugging your own code without being distracted by the library logs and still have the option to see the library logs if needed.
.. Create a named logger
.. .. Create a named logger
.. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. .. code-block:: python

