This is a custom implementation of ChatGPT + LangChain + 700+ documents of the law to be accessed as an endpoint API hosted in Azure. Chat experience is optimized for the accuracy of responses at the expense of speed.
pip install -r requirements.txt
Overview of important libraries:
- LangChain
- ChatGPT key
- Python 3.8+
- Azure Subscription
- Get embeddings through
get_embeddings.py
and place them in a folderembeddings
at the root. - Install requirements.
- Create a .env.prod file with respective keys.
- Configure your AzureML environment (workspace, compute instances, etc.)
- Run
deployment.py
to deploy to AzureML endpoint.
- Embeddings were created using 512 tokens, with overlapping windows of 40 tokens, and the ADA-002 model from OpenAI.
- Pre-processing steps are kept in a different repository.
- Hybrid search consists of latent-based + keyword-based searches. Results are re-ranked using an out-of-the-box re-ranker (hosted in Azure). Hybrid search is optimized for the most accurate responses at the expense of speed.