A Single-Agent Customer Support RAG System (FastAPI + LangChain + Groq + Chroma + Streamlit)
✅ 1. Problem We Are Solving
Modern customer support teams deal with:
- Large PDF manuals (device manuals, onboarding guides, support documentation)
- Repetitive questions (“How to reset?”, “Battery issue?”, “Warranty?”)
- Delays in retrieving the correct answer
- High dependency on support staff expertise
Goal:
Automate customer support by allowing users to:
📄 Upload their product manuals (PDFs)
❓ Ask natural-language questions
🤖 Receive accurate, contextual answers powered by RAG (Retrieval-Augmented Generation)
This reduces support load and improves response quality.
✅ 2. High-Level Solution
We build a single-agent RAG system:
- User uploads a PDF
- System extracts the text, splits it into chunks, and embeds the chunks
- Chunks are stored in ChromaDB
- User asks a question
- System retrieves the relevant chunks
- Groq LLM generates the answer using the retrieved context
The agent does not use multi-agent coordination; instead, it performs the full RAG workflow independently.
✅ 3. Architecture Overview
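In short, the system wires the components below (detailed in Section 4) into one pipeline:

```
Streamlit UI ──► FastAPI backend
                   │
                   ├─ Indexing: PyPDF2 → RecursiveCharacterTextSplitter → HuggingFaceEmbeddings → ChromaDB
                   └─ Query:    ChromaDB retrieval → Groq LLM → answer → back to the UI
```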
✅ 4. Technical Implementation
4.1 Backend — FastAPI
Responsibilities:
- Accept PDF uploads
- Convert PDF → raw text
- Split text into chunks
- Convert chunks → embeddings
- Store embeddings in Chroma
- Handle user queries
- Perform retrieval + LLM answer generation
Key Components:
| Component | Purpose |
|---|---|
| PyPDF2 | PDF → text extraction |
| RecursiveCharacterTextSplitter | Splits text into meaningful chunks |
| HuggingFaceEmbeddings | Embedding generation |
| Chroma | Vector store for retrieval |
| Groq LLM | Generates the final answer |
| LangChain Retrieval Pipeline | Glue connecting all steps |
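A minimal sketch of what the two backend endpoints might look like. The route names and the `rag_pipeline` module with its `index_pdf` / `answer_question` helpers (sketched in Section 4.2) are illustrative assumptions, not code from the repository:

```python
import os

from fastapi import FastAPI, UploadFile
from pydantic import BaseModel

from rag_pipeline import index_pdf, answer_question  # hypothetical module, see Section 4.2

app = FastAPI()

class Query(BaseModel):
    question: str

@app.post("/upload")
async def upload_pdf(file: UploadFile):
    # Save the uploaded PDF, then run the indexing pipeline
    # (text extraction -> chunking -> embeddings -> Chroma).
    os.makedirs("data", exist_ok=True)
    path = os.path.join("data", file.filename)
    with open(path, "wb") as f:
        f.write(await file.read())
    index_pdf(path)
    return {"status": "indexed", "file": file.filename}

@app.post("/ask")
async def ask(query: Query):
    # Retrieve relevant chunks and let the Groq LLM generate the answer.
    return {"answer": answer_question(query.question)}
```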
4.2 RAG Flow (Backend)
✅ Indexing Step (after PDF upload)
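A minimal sketch of this step using the components from the table above; the embedding model, chunk sizes, and the `index_pdf` helper itself are illustrative choices, not code from the repository:

```python
from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

def index_pdf(pdf_path: str, persist_dir: str = "./data") -> None:
    # 1. PDF -> raw text
    reader = PdfReader(pdf_path)
    text = "\n".join(page.extract_text() or "" for page in reader.pages)

    # 2. Split into overlapping chunks so retrieval keeps local context
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_text(text)

    # 3. Embed each chunk and persist the vectors in Chroma
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    Chroma.from_texts(chunks, embedding=embeddings, persist_directory=persist_dir)
```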
✅ Query Step (user question)
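And a matching sketch of the query step; the Groq model name and the prompt wording are assumptions:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_groq import ChatGroq

def answer_question(question: str, persist_dir: str = "./data") -> str:
    # Re-open the persisted vector store with the same embedding model
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    store = Chroma(persist_directory=persist_dir, embedding_function=embeddings)

    # Retrieve the chunks most relevant to the question
    docs = store.similarity_search(question, k=4)
    context = "\n\n".join(doc.page_content for doc in docs)

    # Ask the Groq LLM to answer from the retrieved context only
    llm = ChatGroq(model="llama-3.1-8b-instant")  # reads GROQ_API_KEY from the environment
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm.invoke(prompt).content
```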
4.3 UI — Streamlit
Responsibilities:
- PDF upload
- Send file to backend
- Question textbox
- Display final LLM answer
Ideal for rapid prototyping and demos.
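A minimal sketch of such a UI, reusing the hypothetical /upload and /ask routes from Section 4.1:

```python
import requests
import streamlit as st

BACKEND = "http://localhost:8000"  # assumed backend address

st.title("Customer Support Assistant")

# 1. Upload a product manual and send it to the backend for indexing
pdf = st.file_uploader("Upload a product manual (PDF)", type="pdf")
if pdf is not None:
    requests.post(
        f"{BACKEND}/upload",
        files={"file": (pdf.name, pdf.getvalue(), "application/pdf")},
    )
    st.success("Manual indexed.")

# 2. Ask a question and display the LLM answer
question = st.text_input("Ask a question about the manual")
if question:
    resp = requests.post(f"{BACKEND}/ask", json={"question": question})
    st.write(resp.json()["answer"])
```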
✅ 5. Environment Setup
Download the code from https://github.com/kkvinodkumaran/support_rag_single_agent
Create a .env file in the project root:
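For example (the variable names are assumptions; check the repository README for the exact keys):

```
GROQ_API_KEY=your_groq_api_key
TAVILY_API_KEY=your_tavily_api_key   # optional
```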
Note:
- Get a Groq API key from https://console.groq.com
- Get a Tavily API key (optional) from https://tavily.com
✅ 6. Running the System
6.1 Run using Docker Compose (recommended)
From the project root:
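Assuming the repository ships a standard docker-compose.yml:

```bash
docker compose up --build
```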
Access:
- Backend API Docs: http://localhost:8000/docs
- Streamlit UI: http://localhost:8501
✅ 7. Local Development (uv)
Backend
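A typical uv workflow looks like this (the folder layout and the `main:app` module path are assumptions; check the repository for the actual entry point):

```bash
cd backend                                   # assumed folder name
uv sync                                      # install dependencies from pyproject.toml
uv run uvicorn main:app --reload --port 8000
```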
UI
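Similarly for the Streamlit app (the `app.py` file name is an assumption):

```bash
cd ui                                        # assumed folder name
uv sync
uv run streamlit run app.py --server.port 8501
```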
✅ 8. Data Persistence
- All indexed embeddings are stored in ChromaDB
- Path: ./data
- Mounted inside the backend container at /app/data
- Safe across restarts
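That mount corresponds to a Compose volume along these lines (the service name is an assumption):

```yaml
services:
  backend:
    volumes:
      - ./data:/app/data   # Chroma embeddings survive container restarts
```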
✅ 9. Summary
This project demonstrates:
✅ A single-agent RAG pipeline
✅ Built with modern Python tools (LangChain, Groq, Chroma, FastAPI)
✅ Clean UI via Streamlit
✅ Packaged with uv + Docker Compose
✅ Ready for production-grade expansion
Use this template to build:
- Customer support bots
- Internal knowledge base assistants
- Document Q&A systems
- Policy & compliance assistants