Build, configure, and chat with your own Retrieval-Augmented Generation bots.
Features · Quick Start · Architecture · Tech Stack · Security · Roadmap
- 🔑 Bring-Your-Own-Key — Use your own API keys for OpenAI, Anthropic, Google Gemini, or Mistral. Keys are encrypted at rest with AES-256-GCM. Zero LLM cost for the platform.
- 💬 Real-Time Streaming Chat — Live token-by-token responses powered by the Vercel AI SDK, with markdown rendering and source citations.
- 📂 Document Upload & Retrieval — Upload documents (TXT, MD, PDF) during RAG creation or directly mid-chat via the 📎 button. Keyword retrieval surfaces relevant context to the LLM.
- 🎛️ In-Chat Model Switcher — Switch between 20+ models from all providers on the fly from a dropdown — no need to leave the conversation.
- ⚙️ Live Bot Tuning — Adjust temperature, max tokens, top-p, and system prompt from a slide-out panel without leaving the chat.
- 🛡️ Granular Error Handling — Distinct, actionable error banners for invalid keys, exhausted credits, rate limits, model-not-found, and context overflow.
- 🏗️ Multi-Step Creation Wizard — A guided 6-step wizard: name → model → retrieval → safety → upload → review.
- 🔒 Platform API Keys — Generate `rag_`-prefixed keys for programmatic access, hashed with bcrypt. Revealed once, never stored in plaintext.
- 🌙 Dark/Light Themes — Full theme support via CSS custom properties. No Tailwind — pure vanilla CSS Modules.
The easiest way to run Ragify without worrying about Node.js versions, native build tools, or installing Ollama is using Docker. This will spin up both the Ragify application and a local Ollama instance automatically.
1. Clone the repository:

   ```bash
   git clone https://github.com/bhoomik-codes/ragify.git
   cd ragify
   ```

2. Configure Environment: create your `.env` file. You will need to generate secure keys for `AUTH_SECRET` and `ENCRYPTION_KEY` (see the manual setup step 2 below for instructions on generating these keys).

   ```bash
   cp .env.example .env
   ```

3. Start with Docker Compose:

   ```bash
   docker compose up --build -d
   ```

   Open http://localhost:3000 when the build finishes.
| Requirement | Version |
|---|---|
| Node.js | ≥ 20.x |
| npm | ≥ 10.x (ships with Node 20) |
1. Clone the repository and install dependencies:

   ```bash
   git clone https://github.com/bhoomik-codes/ragify.git
   cd ragify
   npm install
   ```

2. Run the included setup script. This will automatically generate your secure `.env` cryptographic keys, initialize the SQLite database, generate the Prisma client, and verify your local Ollama connection.

   ```bash
   npm run setup
   ```

3. Start the development server:

   ```bash
   npm run dev
   ```

   Open http://localhost:3000, register an account, and create your first RAG.
Don't have LLM API keys yet? No problem — set MOCK_MODE="true" in .env to run with simulated responses. This lets you explore the full UI, upload documents, and test the creation wizard without spending a cent.
Once running, navigate to Settings → Provider Keys in the app. Add your API key for any provider (OpenAI, Anthropic, Google, Mistral). Keys are encrypted with AES-256-GCM before touching the database — the raw key is never stored.
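The encryption step described above can be sketched with Node's built-in `crypto` module. This is a minimal illustration under stated assumptions, not the contents of `lib/crypto.ts`: the helper names `parseKey`, `encryptSecret`, and `decryptSecret` and the `iv:tag:ciphertext` storage format are hypothetical. Only the algorithm (AES-256-GCM), the unique IV per key, and the 64-character hex `ENCRYPTION_KEY` come from this README.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical helper: turns the 64-char hex ENCRYPTION_KEY into 32 raw bytes.
function parseKey(hexKey: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hexKey)) {
    throw new Error("ENCRYPTION_KEY must be exactly 64 hex characters");
  }
  return Buffer.from(hexKey, "hex"); // 32 bytes = 256 bits
}

// Hypothetical helper: encrypts one provider key with a fresh IV.
function encryptSecret(plaintext: string, hexKey: string): string {
  const iv = randomBytes(12); // new 96-bit IV for every encryption
  const cipher = createCipheriv("aes-256-gcm", parseKey(hexKey), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // 16-byte authentication tag
  // Assumed storage format: iv:tag:ciphertext, each part hex-encoded.
  return [iv, tag, ciphertext].map((part) => part.toString("hex")).join(":");
}

// Hypothetical helper: reverses encryptSecret and verifies the auth tag.
function decryptSecret(payload: string, hexKey: string): string {
  const parts = payload.split(":");
  if (parts.length !== 3) throw new Error("Malformed encrypted payload");
  const [iv, tag, ciphertext] = parts.map((part) => Buffer.from(part, "hex"));
  const decipher = createDecipheriv("aes-256-gcm", parseKey(hexKey), iv);
  decipher.setAuthTag(tag); // final() throws if the data or tag was modified
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because the IV is random, encrypting the same provider key twice produces two different stored values, and the GCM authentication tag lets decryption reject a value that was altered in the database.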
- Vector search first: if an embedding model is available, Ragify embeds the user query and retrieves the top-K most similar chunks.
- Keyword fallback: if vector retrieval is unavailable or returns no results, Ragify falls back to SQLite FTS5 keyword search.
- No-context response: if both return no matches, Ragify responds neutrally instead of injecting arbitrary chunks into the context window.
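The three-stage order above can be sketched as a single function. This is an illustration only: the `Chunk`, `RetrievalDeps`, and `retrieveContext` names, the `topK` default of 5, and the choice to treat a thrown vector-search error as "unavailable" are assumptions, not Ragify's actual code.

```typescript
// Hypothetical types; only the vector -> keyword -> no-context order comes from the README.
interface Chunk {
  id: string;
  text: string;
}

interface RetrievalDeps {
  // null models the case where no embedding model is available.
  vectorSearch: ((query: string, topK: number) => Promise<Chunk[]>) | null;
  // Stands in for the SQLite FTS5 keyword search.
  keywordSearch: (query: string, topK: number) => Promise<Chunk[]>;
}

type RetrievalResult =
  | { source: "vector" | "keyword"; chunks: Chunk[] }
  | { source: "none"; chunks: Chunk[] };

async function retrieveContext(
  query: string,
  deps: RetrievalDeps,
  topK = 5,
): Promise<RetrievalResult> {
  // 1. Vector search first, when an embedding model is configured.
  if (deps.vectorSearch) {
    try {
      const hits = await deps.vectorSearch(query, topK);
      if (hits.length > 0) return { source: "vector", chunks: hits };
    } catch {
      // Assumption: a failed vector lookup counts as "unavailable"; fall through.
    }
  }
  // 2. Keyword fallback.
  const keywordHits = await deps.keywordSearch(query, topK);
  if (keywordHits.length > 0) return { source: "keyword", chunks: keywordHits };
  // 3. Nothing matched: return no chunks so the caller can answer neutrally
  //    instead of injecting arbitrary context.
  return { source: "none", chunks: [] };
}
```

Returning the `source` alongside the chunks lets the caller decide whether to build a prompt with citations or send the neutral "no relevant information found" reply.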
```
ragify/
├── app/
│   ├── (auth)/                 # Login, Signup, Forgot/Reset Password
│   ├── (app)/                  # Protected routes (requires session)
│   │   ├── dashboard/          # RAG cards grid
│   │   │   └── new/            # 6-step creation wizard
│   │   ├── rags/[ragId]/
│   │   │   └── chat/           # Chat UI (model switcher, params panel, upload)
│   │   └── settings/           # BYOK & Platform key management
│   ├── (marketing)/            # Public landing page
│   └── api/
│       ├── auth/               # NextAuth handlers + credential flows
│       ├── rags/               # CRUD, streaming chat, document upload
│       │   └── [id]/
│       │       ├── chat/       # POST — streaming chat endpoint
│       │       └── documents/  # POST — in-chat file upload
│       └── users/me/
│           ├── provider-keys/  # BYOK key CRUD
│           └── platform-keys/  # Platform API key CRUD
│
├── components/
│   ├── layout/                 # AppShell, Sidebar, TopBar, ThemeToggle
│   ├── settings/               # ProviderKeyManager, PlatformKeyManager
│   ├── shared/                 # ConfirmDialog, EmptyState, OnboardingTour
│   └── ui/                     # Button, Card, Input, Modal, Badge, Spinner
│
├── lib/
│   ├── auth.ts                 # NextAuth v5 config (credentials provider)
│   ├── crypto.ts               # AES-256-GCM encrypt/decrypt + bcrypt
│   ├── llm.ts                  # Provider-agnostic streaming + error classification
│   ├── pipeline.ts             # Document parse → chunk → embed pipeline
│   ├── vector.ts               # Cosine similarity, serialize, searchChunks
│   ├── validators.ts           # Zod schemas for all API payloads
│   ├── types.ts                # SSoT for enums, DTOs, interfaces
│   ├── mappers.ts              # Prisma row → safe DTO mapping
│   ├── db.ts                   # Prisma client singleton
│   └── mail.ts                 # Email transport (password reset)
│
├── prisma/
│   └── schema.prisma           # Database schema
│
├── middleware.ts               # Auth route protection
└── .env.example                # Template for environment variables
```
| Concern | Implementation |
|---|---|
| Provider API keys | AES-256-GCM with unique IV per key |
| Platform API keys | bcrypt hashed; raw key shown exactly once |
| Route authorization | IDOR check (userId match) on every API route |
| Input validation | Zod schemas on all API payloads |
| DTO mapping | Raw Prisma objects never returned to clients |
| Error classification | LLM errors mapped to specific codes (no stack leaks) |
| Secrets | .env excluded from git; ENCRYPTION_KEY required |
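The error-classification row above could be implemented along the following lines. This is a hedged sketch, not the logic in `lib/llm.ts`: the code names, the user-facing messages, and the matching heuristics (HTTP status plus message keywords) are all assumptions, and real providers differ in the statuses and wording they return.

```typescript
// Hypothetical error codes covering the five cases named in the feature list.
type LlmErrorCode =
  | "INVALID_KEY"
  | "INSUFFICIENT_CREDITS"
  | "RATE_LIMITED"
  | "MODEL_NOT_FOUND"
  | "CONTEXT_OVERFLOW"
  | "UNKNOWN";

interface ClassifiedError {
  code: LlmErrorCode;
  message: string; // safe to show in a banner; never the raw error or stack
}

// Assumed banner texts, chosen for illustration.
const USER_MESSAGES: Record<LlmErrorCode, string> = {
  INVALID_KEY: "Your API key was rejected. Check it under Settings → Provider Keys.",
  INSUFFICIENT_CREDITS: "Your provider account is out of credits.",
  RATE_LIMITED: "The provider is rate limiting requests. Try again shortly.",
  MODEL_NOT_FOUND: "The selected model is not available for this key.",
  CONTEXT_OVERFLOW: "The conversation is too long for this model's context window.",
  UNKNOWN: "The model request failed. Please try again.",
};

// Reads a numeric `status` property if the thrown value has one.
function getStatus(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

function classifyLlmError(err: unknown): ClassifiedError {
  const status = getStatus(err);
  const text = err instanceof Error ? err.message.toLowerCase() : "";
  let code: LlmErrorCode = "UNKNOWN";
  if (status === 401) {
    code = "INVALID_KEY";
  } else if (status === 402 || text.includes("quota") || text.includes("billing")) {
    // Checked before 429 because some providers report exhausted quota as 429.
    code = "INSUFFICIENT_CREDITS";
  } else if (status === 429) {
    code = "RATE_LIMITED";
  } else if (status === 404) {
    code = "MODEL_NOT_FOUND";
  } else if (text.includes("context length") || text.includes("maximum context")) {
    code = "CONTEXT_OVERFLOW";
  }
  return { code, message: USER_MESSAGES[code] };
}
```

The response is built only from the fixed message table, so provider error text and stack traces never reach the client.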
| Variable | Required | Default | Description |
|---|---|---|---|
| `DATABASE_URL` | ✅ | `file:./dev.db` | Database connection string |
| `AUTH_SECRET` | ✅ | — | NextAuth session signing key |
| `AUTH_URL` | ✅ | `http://localhost:3000` | App base URL for NextAuth callbacks |
| `ENCRYPTION_KEY` | ✅ | — | 64-char hex string for AES-256-GCM |
| `MOCK_MODE` | ❌ | `"false"` | Set `"true"` to bypass real LLM calls |
| `MOCK_PIPELINE_DELAY_MS` | ❌ | `500` | Simulated pipeline delay (ms) |
| `OPENAI_API_KEY` | ❌ | — | Platform-level OpenAI fallback key |
| `ANTHROPIC_API_KEY` | ❌ | — | Platform-level Anthropic fallback key |
| `GOOGLE_API_KEY` | ❌ | — | Platform-level Google fallback key |
| `MISTRAL_API_KEY` | ❌ | — | Platform-level Mistral fallback key |
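A startup check for the variables in this table might look like the sketch below. The `AppEnv` shape and the `loadEnv` function are hypothetical and not part of the repository; the variable names, the 64-character hex rule, and the optional defaults come from the table. The table marks `DATABASE_URL` and `AUTH_URL` as required, so this sketch fails when they are missing rather than silently using the listed defaults.

```typescript
// Hypothetical parsed configuration shape.
interface AppEnv {
  databaseUrl: string;
  authSecret: string;
  authUrl: string;
  encryptionKey: string;
  mockMode: boolean;
  mockPipelineDelayMs: number;
}

// Hypothetical loader: validates required variables and applies optional defaults.
function loadEnv(env: Record<string, string | undefined>): AppEnv {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`Missing required environment variable: ${name}`);
    return value;
  };

  const encryptionKey = required("ENCRYPTION_KEY");
  if (!/^[0-9a-fA-F]{64}$/.test(encryptionKey)) {
    throw new Error("ENCRYPTION_KEY must be exactly 64 hex characters");
  }

  // Optional values fall back to the defaults listed in the table.
  const delay = Number(env["MOCK_PIPELINE_DELAY_MS"] ?? "500");
  if (!Number.isFinite(delay) || delay < 0) {
    throw new Error("MOCK_PIPELINE_DELAY_MS must be a non-negative number");
  }

  return {
    databaseUrl: required("DATABASE_URL"),
    authSecret: required("AUTH_SECRET"),
    authUrl: required("AUTH_URL"),
    encryptionKey,
    mockMode: env["MOCK_MODE"] === "true", // anything else counts as "false"
    mockPipelineDelayMs: delay,
  };
}
```

Calling `loadEnv(process.env)` once at startup would surface a missing or malformed value immediately instead of at the first encrypted read.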
- Authentication & Dashboard
- Multi-step RAG creation wizard
- Streaming chat with Vercel AI SDK
- Document ingestion pipeline (OOM-safe batching)
- BYOK provider key management
- Platform API keys
- In-chat model switcher
- Bot parameter tuning panel
- In-chat document upload
- Granular LLM error classification
- Dashboard Bot Management (Edit/Delete)
- Chat History Sidebar & Resumption
- Real embedding API integration (Ollama & OpenAI)
- Analytics dashboard (token usage, response times)
- Public RAG sharing links
- S3/R2 object storage for documents
- Hybrid Retrieval (Vector search + FTS5 fallback)
- Internet Search Integration (Tavily/Serper)
- PostgreSQL & pgvector migration
- Dedicated Worker Queue (Redis/BullMQ)
- Memory Stabilization: Implemented batch-processing in the ingestion pipeline to prevent Heap Out of Memory (OOM) errors during large document processing.
- Improved UX: Added a "Retry" button and detailed error messaging (e.g., "Rate limited") for document uploads in the creation wizard.
- Rate Limit Optimization: Increased document upload rate limits from 10 to 50 per minute to better support large batch uploads.
- Mermaid Diagram Reliability:
  - Updated system prompts with strict syntax rules for modern Mermaid (flowchart TD, quoted labels).
  - Enhanced the Mermaid component to capture and display actual parser errors instead of hanging on failures.
- Infrastructure & Testing:
  - Expanded test suite with comprehensive integration tests for API routes and pipeline logic.
- Ingestion reliability: Upload ingestion no longer detaches a background promise that can be killed in serverless runtimes; temp files are always cleaned up.
- No-context handling: Removed the “stuff the first 3 chunks” fallback; unrelated questions now return a neutral “no relevant information found” response.
- Upload security:
  - Filename sanitization + upload-dir containment to prevent path traversal
  - Strict allowlist validation (415) for `.txt`/`.md`/`.pdf`/`.docx`/`.csv`
  - 10MB max upload size (413) + basic per-user upload rate limit (429)
- Performance:
  - Vector retrieval yields to the event loop during similarity scoring and caps embeddings per query (with pgvector migration guidance in `FUTURE_PLAN.md`)
  - Keyword fallback upgraded to SQLite FTS5 (with raw SQL migration + safe fallback if not applied)
- Pipeline quality & resilience:
  - Semantic chunking (paragraph → line → sentence → word) with overlap preservation
  - Extraction failures mark the document as `FAILED` with a human-readable `errorMessage`
- Tests & maintenance:
  - Added unit tests for vector utilities + chunking
  - Standardized module imports and added a minimal Vitest runner (`npm test`)
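The upload-security checks listed above (filename sanitization, upload-dir containment, the 415 allowlist, and the 413 size limit) can be sketched with Node's `path` module. The names `UploadError`, `sanitizeFilename`, and `resolveUploadPath`, the character whitelist, the 400 status for bad names, and reading "10MB" as 10 × 1024 × 1024 bytes are assumptions made for illustration. Rate limiting (429) is out of scope here.

```typescript
import * as path from "node:path";

// Allowlist and size limit as stated in the changelog entry.
const ALLOWED_EXTENSIONS = new Set([".txt", ".md", ".pdf", ".docx", ".csv"]);
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024; // assumption: 10MB means 10 MiB

// Hypothetical error type carrying the HTTP status to return.
class UploadError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// Hypothetical sanitizer: keeps the last path segment and a conservative character set.
function sanitizeFilename(name: string): string {
  const base = name.split(/[\\/]/).pop() ?? ""; // handles both / and \ separators
  const cleaned = base.replace(/[^A-Za-z0-9._-]/g, "_").replace(/^\.+/, "");
  if (!cleaned) throw new UploadError(400, "Invalid filename");
  return cleaned;
}

// Hypothetical resolver: returns the absolute path to write to, or throws.
function resolveUploadPath(uploadDir: string, originalName: string, sizeBytes: number): string {
  if (sizeBytes > MAX_UPLOAD_BYTES) throw new UploadError(413, "File exceeds the 10MB limit");
  const safeName = sanitizeFilename(originalName);
  const extension = path.extname(safeName).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(extension)) throw new UploadError(415, "Unsupported file type");
  const root = path.resolve(uploadDir);
  const target = path.resolve(root, safeName);
  // Containment check: the resolved target must stay inside the upload directory.
  if (!target.startsWith(root + path.sep)) {
    throw new UploadError(400, "Path escapes the upload directory");
  }
  return target;
}
```

Sanitizing and checking containment are deliberately separate steps: the containment check still holds if the sanitizer is later relaxed.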
- Enhanced Chat Experience: Introduced a collapsible sidebar for managing conversation history, allowing users to seamlessly resume previous chats or start new ones.
- Bot Lifecycle Management: Added Edit and Delete functionality directly from the dashboard bot cards.
- Improved RAG Creation Wizard: Integrated dropdowns for model selection based on providers, and added an interactive emoji picker for bot avatars.
- Local Model Support: Added support for local Ollama models (including Qwen3 and Deepseek) alongside cloud providers.
- Secure File Uploads: Users can now upload various documents (.txt, .md, .csv, .pdf, .docx, .pptx) directly into an active chat context.
Prisma: "Cannot find module '@prisma/client'"
Run npx prisma generate to regenerate the Prisma client after any schema change.
Hydration mismatch errors
The Modal component uses a mounted state pattern to avoid server/client mismatch. If you see hydration errors after adding a new modal, ensure it returns null on the first render pass.
"Invalid payload" when saving provider keys
The validator trims whitespace automatically. If the error persists, check that the key format matches the provider's expected pattern (e.g., sk-... for OpenAI).
ENCRYPTION_KEY errors on startup
The key must be exactly 64 hex characters. Generate one with:
```bash
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
```

This project is licensed under the MIT License.
