PAI-RAG: An easy-to-use framework for modular RAG

💡 What is PAI-RAG?

PAI-RAG is an easy-to-use open-source framework for modular RAG (Retrieval-Augmented Generation). It combines retrieval with an LLM (Large Language Model) to provide truthful question answering, and it supports flexible configuration and custom development of each module of the RAG system. Built on Alibaba Cloud's Platform of Artificial Intelligence (PAI), it offers a production-level RAG workflow for businesses of any scale.

🎬 PAI-RAG with Web Search Demo (local client using Cherry Studio)

[Demo video: RAGDemo_output.mov]

🌟 Key Features

  • Modular design that is flexible and configurable
  • Powerful RAG capabilities: multimodal RAG, agentic RAG, and NL2SQL support
  • Built on open-source community components, which keeps the barrier to customization low
  • Multi-dimensional automatic evaluation that makes the quality of each module easy to assess
  • Integrated tracing and evaluation visualization tools for LLM-based applications
  • Interactive UI and API calls for convenient iterative tuning
  • Several deployment options: quick scenario-based deployment on Alibaba Cloud, custom image deployment, or private deployment of the open-source version

🔎 Get Started

You can run PAI-RAG locally using either a Docker environment or directly from the source code.

Run with Docker

  1. Set up the environment variables.

    git clone git@github.com:aigc-apps/PAI-RAG.git
    cd PAI-RAG/docker
    cp .env.example .env

    Edit the .env file if you use the DashScope API or OSS storage. See .env.example for details. You can also configure these settings from the console UI, but setting them through environment variables is safer.

  2. Start the Docker containers with the following command:

    docker compose up -d
  3. Open your web browser and navigate to http://localhost:8680 to verify that the service is running. The service will need to download the model weights, which may take around 20 minutes.
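
Because the first start can take a while, it may be convenient to poll the URL from a script instead of refreshing the browser. The following is a minimal Python sketch under stated assumptions: the helper names `is_up` and `wait_until_up` are hypothetical, the 30-minute ceiling and 10-second interval are arbitrary choices, and an HTTP response is treated as a rough signal only, since this README does not say whether the page responds before the model weights finish downloading.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if an HTTP GET to `url` gets any non-5xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status code.
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def wait_until_up(url, probe=is_up, max_wait=1800.0, interval=10.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll `url` until `probe` succeeds or `max_wait` seconds have passed.

    `probe`, `sleep`, and `clock` are injectable so the loop can be
    exercised without a running service.
    """
    deadline = clock() + max_wait
    while True:
        if probe(url):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_up("http://localhost:8680")` returns `True` once the service answers, or `False` if the time limit is reached first.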

Run in a Local Environment

If you prefer to run or develop PAI-RAG locally, see the local development guide.

Simple Query Using the Web UI

  1. Open http://localhost:8680 in your web browser and set the index and LLM to your preferred models.

  2. Go to the "Upload" tab and upload the test data: ./example_data/paul_graham/paul_graham_essay.txt.

  3. Once the upload is complete, switch to the "Chat" tab to run a query.

Simple Query Using the RAG API

  1. Open http://localhost:8680 in your web browser and set the index and LLM to your preferred models.

  2. Upload data via the API. Go to the PAI-RAG base directory:

    cd PAI-RAG

    Request

    curl -X 'POST' http://localhost:8680/api/v1/knowledgebases/{knowledgebase_name}/files \
       -H 'Content-Type: multipart/form-data' \
       -F 'files=@example_data/paul_graham/paul_graham_essay.txt'

    Response

    {
      "message": "Files have been successfully uploaded."
    }

    Note: the file is first uploaded to the RAG service; a background job then picks it up and indexes it into the vector store.

  3. Check the status of the upload job:

    Request

    curl -X 'GET' http://localhost:8680/api/v1/knowledgebases/{knowledgebase_name}/history

    Response

    [
      {
        "task_id": "93d3782ccd4b33afdc1b6a1f0ce18e3a",
        "operation": "ADD",
        "file_name": "localdata/knowledgebase/default/docs/paul_graham_essay.txt",
        "status": "done",
        "last_modified_time": "2025-03-28 15:47:07"
      }
    ]
  4. Perform a RAG query (OpenAI-compatible):

    Request

    curl -X 'POST' http://localhost:8680/v1/chat/completions \
       -H "Content-Type: application/json" \
       -d '{
       "model": "default",
       "messages": [
          {"role": "user", "content": "杭州在中国哪个省?"}
       ],
       "stream": false
    }'

    Response

    {
      "id": "7aac074feef14c31a322b15bb1c4c452",
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "杭州位于中国的**浙江省**。它是浙江省的省会城市,也是中国著名的历史文化名城和旅游胜地,以西湖、龙井茶等闻名于世。",
            "refusal": null,
            "role": "assistant",
            "audio": null,
            "function_call": null,
            "tool_calls": null
          }
        }
      ],
      "created": 1743148661,
      "model": "DeepSeek-V3",
      "object": "chat.completion",
      "service_tier": null,
      "system_fingerprint": null,
      "usage": {
        "completion_tokens": 46,
        "prompt_tokens": 2114,
        "total_tokens": 2160,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "citation_details": [],
      "citations": []
    }
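
The three API calls above (upload, status check, query) can also be scripted. The following is a minimal Python sketch using only the standard library, not an official client. The endpoints, the `files` form field, the `"done"` status, and the response shapes are taken from the curl examples above; every function name is hypothetical. Only the `"done"` status appears in this README, so the sketch treats any other value as still pending, which means a failed task would simply run into the poll limit.

```python
import json
import mimetypes
import time
import urllib.request
import uuid
from pathlib import Path

BASE_URL = "http://localhost:8680"


def encode_multipart(field, filename, data, boundary=None):
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def call_json(url, data=None, content_type=None):
    """Send a request (POST if `data` is given, else GET) and decode the JSON reply."""
    headers = {"Content-Type": content_type} if content_type else {}
    req = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def upload_file(kb_name, path, base_url=BASE_URL):
    """Upload one file to the knowledge base, mirroring the curl -F 'files=@...' call."""
    path = Path(path)
    body, content_type = encode_multipart("files", path.name, path.read_bytes())
    return call_json(f"{base_url}/api/v1/knowledgebases/{kb_name}/files", body, content_type)


def all_done(history):
    """True once the history is non-empty and every task reports status "done"."""
    return bool(history) and all(task.get("status") == "done" for task in history)


def wait_for_indexing(kb_name, base_url=BASE_URL, interval=5.0, max_polls=120):
    """Poll the history endpoint until all tasks are done (limits are arbitrary)."""
    url = f"{base_url}/api/v1/knowledgebases/{kb_name}/history"
    for _ in range(max_polls):
        history = call_json(url)
        if all_done(history):
            return history
        time.sleep(interval)
    raise TimeoutError(f"indexing did not finish after {max_polls} polls")


def build_chat_request(question, model="default"):
    """Build the same JSON payload as the curl example (non-streaming)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }


def ask(question, base_url=BASE_URL):
    """Send one question and return the assistant's answer text."""
    payload = json.dumps(build_chat_request(question)).encode("utf-8")
    reply = call_json(f"{base_url}/v1/chat/completions", payload, "application/json")
    return reply["choices"][0]["message"]["content"]
```

A typical run from the PAI-RAG base directory would call `upload_file("default", "example_data/paul_graham/paul_graham_essay.txt")`, then `wait_for_indexing("default")`, then `ask(...)` with your question. The knowledge base name `default` is an assumption inferred from the file path in the sample history response; substitute your own name if it differs.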

📜 Documents

API specification

You can access and integrate our RAG service according to our API specification.

MultiModal RAG

You can use multimodal RAG to process documents that contain images. For details, see the documentation: MultiModal RAG

Agentic RAG

You can use agents with function-calling API tools in PAI-RAG. For details, see the documentation: Agentic RAG

Data Analysis

You can run data analysis over a database or a spreadsheet file in PAI-RAG. For details, see the documentation: Data Analysis

Supported File Types

File Type      File Formats
Unstructured   .txt, .docx, .pdf, .html, .pptx, .md
Images         .gif, .jpg, .png, .jpeg, .webp
Structured     .csv, .xls, .xlsx, .jsonl
Others         .epub, .mbox, .ipynb

  1. .doc files need to be converted to .docx files.
  2. .ppt and .pptm files need to be converted to .pptx files.
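
Before uploading a batch of files, it can help to check each one against the list above. The following is a minimal Python sketch: the extension sets are copied from the table and notes, while the function name `check_file` and the case-insensitive matching are assumptions, since this README does not say how the service itself matches extensions.

```python
from pathlib import Path

# Extensions copied from the "Supported File Types" table.
SUPPORTED = {
    "Unstructured": {".txt", ".docx", ".pdf", ".html", ".pptx", ".md"},
    "Images": {".gif", ".jpg", ".png", ".jpeg", ".webp"},
    "Structured": {".csv", ".xls", ".xlsx", ".jsonl"},
    "Others": {".epub", ".mbox", ".ipynb"},
}

# Formats that the notes say must be converted first, mapped to their target.
NEEDS_CONVERSION = {".doc": ".docx", ".ppt": ".pptx", ".pptm": ".pptx"}


def check_file(path):
    """Classify a file by extension.

    Returns (category, None) for a supported file, (None, target_ext) for a
    file that must be converted first, and (None, None) for anything else.
    """
    ext = Path(path).suffix.lower()  # assumption: case-insensitive match
    for category, extensions in SUPPORTED.items():
        if ext in extensions:
            return category, None
    return None, NEEDS_CONVERSION.get(ext)
```

For example, `check_file("slides.ppt")` reports that the file should be converted to `.pptx` before upload.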