Llamapp.

Run RAG locally. Completely.

Overview

This project implements a Retrieval Augmented Generator (RAG) that operates entirely locally, combining document retrieval and language model generation to provide accurate and contextually relevant responses. It leverages Langchainjs. By leveraging a custom retrieval mechanism and the IBM Granite3-moe:1b [https://ibm.com/granite3-moe] language model, the system minimizes hallucinations and enhances response quality. This is achieved via Reciprocal Rank Fusion.

Why RAG and why RAG Fusion?

RAG-Fusion, a search methodology that aims to bridge the gap between traditional search paradigms and the multifaceted dimensions of human queries. Inspired by the capabilities of Retrieval Augmented Generation (RAG), this project goes a step further by employing multiple query generation and Reciprocal Rank Fusion to re-rank search results. The overarching goal is to move closer to unearthing that elusive 90% of transformative knowledge that often remains hidden behind top search results.

Objectives

Accurate Information Retrieval: Utilize vector embeddings and custom retrieval algorithms to fetch the most relevant documents.
Enhanced Response Generation: Generate responses using the context of retrieved documents to ensure accuracy and relevance. More importantly, traceability.
Local Operation: Ensure all components run locally, providing privacy and control over the data and processing.

Prerequisites

Node.js
Redis Stack Server [https://redis.io/about/about-stack/]
IBM Granite3-moe:1b & nomic embed text models [ibm.com]
System with minimum 8GB RAM and mid performance processor. (This was developed and runs smoothly on Macbook Air with M3 chipset)

Project Structure

The project consists of the following key components:

Document Embedding and Storage:

- **Embeddings:** Nomic text encoding for creating embeddings. [https://ollama.com/library/nomic-embed-text]
- **Vector Store:** Redis vector store with indices for different types of documents (API and Docs).

Chains and Processes:

- **Document Loading:** Parsing and storing documents based on their type.
- **User Input Handling:** Rewriting user input for clarity and focus.
- **Document Retrieval:** Custom retrieval based on similarity search and scoring with different thresholds for different indices.
- **Reciprocal Rank Fusion (RRF):** Algorithm to determine the most relevant documents from alternative queries.
- **Response Generation:** Using the retrieved documents as context to generate responses, avoiding hallucinations.

Key Classes and Functions:
- RelevantDocumentsRetriever: Custom retriever to fetch relevant documents from Redis vector stores.
- generateQueries: Generate alternate queries based on the user's original query.
- fusion: Apply the RRF algorithm to rank and filter relevant documents.
- reciprocalRankFusion: Implementation of the RRF algorithm to combine and rank document relevance.

Getting Started

Prerequisites

Node.js: Ensure you have Node.js installed.
Redis: Set up a local instance of Redis.
IBM Model: Download and configure the IBM Granite3-moe:1b model for local usage.

Installation

Clone the repository:

git clone https://github.com/rajatasususual/llamapp.git
cd llamapp

Install dependencies:
```
npm install
```

Set up environment variables:

Create a .env file in the root directory with the following contents:

#APP CONFIG
PORT=3000

# MAIN SWITCHES
LOAD_DOCS=false
REWRITE=true
FUSION=true

#CHAT MODELS
EMBEDDING_MODEL="nomic-embed-text"
BASE_URL="http://localhost:11434"
CHAT_MODEL="granite3-moe:1b"
CHAT_TEMPERATURE=0

#DATASTORE
REDIS_HOST="redis://localhost:"
REDIS_PORT="6379"

#SEARCH SENSITIVITY
L2_INDEX_THRESHOLD=250
COSINE_INDEX_THRESHOLD=0.25
FUSION_THRESHOLD=0.1

Running the Project

You can run the project in two different modes: CLI and UI.

CLI Mode

Start Redis:

Ensure your Redis server is running. You can start Redis with:
```
redis-stack-server
```

Load Documents:

Use the provided functions to load documents into the vector stores:

import { RedisVectorStore } from "@langchain/redis";
import { loadDocuments } from "./main";

const documentStore = new RedisVectorStore(/* your configuration */);
const apiStore = new RedisVectorStore(/* your configuration */);

// Load documents into the vector stores
await loadDocuments(documentStore, 'path/to/your/html/documents');
await loadDocuments(apiStore, 'path/to/your/json/documents');

It is important that you load plenty of documents to ensure dense search works its magic. We are working with a LLM trained to run with minimum footprint and hence needs all the support it can.

Run the CLI Application:
```
npm run cli
```

UI Mode

this project is moved to a standalone setup. Read more about the application in llamapp-web[www.github.com/rajatasusual/llamapp-web]

Start Redis:

Ensure your Redis server is running. You can start Redis with:
```
redis-stack-server
```

Note: For vector storage, you need redis-stack-server.

Load Documents:

Use the provided functions to load documents into the vector stores as described in the CLI section.
Start the Express Server:
```
npm start
```
This will run the serve.ts file, which contains an Express.js setup that runs the engine locally for an application to communicate.
Access the Application:

Open your browser and navigate to the local server (e.g., http://localhost:3000) to interact with the application through the UI.

Example Usage

Here's a brief example of how you can use the RAG engine to process a user query and generate a response:

import { respond } from "./main";

// Example query
const question = "How do I set up a local server?";

// Get response
respond(question).then(response => {
     console.log("Response:", response);
});

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue for any improvements or bugs you find.

License

This project is licensed under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
assets		assets
src		src
.env		.env
.gitignore		.gitignore
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Llamapp.

Run RAG locally. Completely.

Overview

Why RAG and why RAG Fusion?

Objectives

Prerequisites

Project Structure

Getting Started

Prerequisites

Installation

Running the Project

CLI Mode

UI Mode

Example Usage

Contributing

License

About

Contributors 2

Languages

rajatasusual/llamapp

Folders and files

Latest commit

History

Repository files navigation

Llamapp.

Run RAG locally. Completely.

Overview

Why RAG and why RAG Fusion?

Objectives

Prerequisites

Project Structure

Getting Started

Prerequisites

Installation

Running the Project

CLI Mode

UI Mode

Example Usage

Contributing

License

About

Topics

Resources

Stars

Watchers

Forks

Contributors 2

Languages