This project implements a modular retrieval-augmented generation (RAG) pipeline built from interchangeable components for data processing, vector storage, and answer generation.
- Create a virtual environment:
  python -m venv .venv
- Activate the virtual environment:
  - On macOS and Linux:
    source .venv/bin/activate
  - On Windows:
    .venv\Scripts\activate
- Install the required packages:
  pip install uv
  uv sync
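
If packages seem to land in the wrong place, it can help to confirm that the virtual environment is actually active. The following is a minimal sketch using only the standard library; the function name in_virtualenv is illustrative and not part of this project. It relies on the fact that, inside an environment created with venv, sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # venv points sys.prefix at the environment directory, while
    # sys.base_prefix keeps pointing at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same python command you use for the project; if it reports False, activate the environment again before installing.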
To run the application using Streamlit:
streamlit run ui/app.py
To run the tests located in the test directory:
pytest test/
You can modify the default configuration by editing the config.yaml file, which holds the settings for the Modular RAG Pipeline components.
Refer to the existing config.yaml as an example of the expected structure.
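
To get a quick overview of what the configuration currently contains, a small helper along these lines may be useful. This is a sketch, not part of the project: the function names load_config and describe are hypothetical, and it assumes PyYAML is installed in the environment and that config.yaml has a mapping at its top level. No particular keys are assumed.

```python
from pathlib import Path
from typing import Any


def load_config(path: str = "config.yaml") -> dict[str, Any]:
    """Parse a YAML file into a dictionary (assumes PyYAML is installed)."""
    import yaml  # third-party; imported lazily so describe() works without it

    with Path(path).open(encoding="utf-8") as handle:
        data = yaml.safe_load(handle)
    # safe_load returns None for an empty file.
    return data or {}


def describe(config: dict[str, Any], indent: int = 0) -> list[str]:
    """Return one line per key, with nested keys indented under their parent."""
    lines: list[str] = []
    for key, value in config.items():
        if isinstance(value, dict):
            lines.append(f"{' ' * indent}{key}:")
            lines.extend(describe(value, indent + 2))
        else:
            lines.append(f"{' ' * indent}{key}: {value!r}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe(load_config())))
```

Running the script from the project root prints the nested keys of config.yaml, which makes it easier to see which settings can be changed.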
You can create your own components by following the structure of the existing base components in the components directory. New components appear automatically in the Streamlit UI, where you can select them. You can also reference them directly in the config.yaml file.
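
The base classes in the components directory are the reference for how a custom component should look. Purely as an illustration of the general pattern, the sketch below shows one way a base class could register its subclasses so that a UI or a config entry can find them by name. Every name here (BaseRetriever, REGISTRY, KeywordRetriever, the retrieve method, and the "retriever" config key) is hypothetical and not taken from this project, whose actual discovery mechanism may differ.

```python
from abc import ABC, abstractmethod

# Hypothetical registry: maps a component name to its class.
REGISTRY: dict[str, type] = {}


class BaseRetriever(ABC):
    """Hypothetical base component; the project's real base classes may differ."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Each subclass registers itself under its class name when defined.
        REGISTRY[cls.__name__] = cls

    @abstractmethod
    def retrieve(self, query: str, documents: list[str]) -> list[str]:
        """Return the documents considered relevant to the query."""


class KeywordRetriever(BaseRetriever):
    """Toy custom component: keeps documents that share a word with the query."""

    def retrieve(self, query: str, documents: list[str]) -> list[str]:
        words = set(query.lower().split())
        return [doc for doc in documents if words & set(doc.lower().split())]


def build_from_config(config: dict[str, str]) -> BaseRetriever:
    """Instantiate the component named under the hypothetical 'retriever' key."""
    return REGISTRY[config["retriever"]]()


if __name__ == "__main__":
    retriever = build_from_config({"retriever": "KeywordRetriever"})
    print(retriever.retrieve("vector storage", ["Vector databases", "Answer generation"]))
```

The point of the pattern is that defining a subclass is enough to make it selectable by name, which matches the behavior described above without claiming to reproduce it.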