AI Cloud Drive

Document management platform with hybrid retrieval, PDF parsing, and guardrail-based validation.

Tech Stack

FastAPI, Celery, Groq (Llama 3), ChromaDB

Links

GitHub

Project Overview

The AI Cloud Drive is a document retrieval platform combining secure file storage with hybrid retrieval pipelines. Users upload technical documents to a personal drive and query them using context-grounded retrieval and cross-encoder reranking.
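As an illustration of what a hybrid retrieval step can look like, the sketch below merges two ranked result lists (for example, one from keyword search and one from vector search) using reciprocal rank fusion. The fusion method, the function name `reciprocal_rank_fusion`, and the constant `k=60` are assumptions made for this example; the project may combine its retrievers differently, and the cross-encoder reranking stage is not shown.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID to keep output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from two retrievers.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

The fused list would then typically be truncated and passed to a reranker, which scores each (query, chunk) pair directly.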

Core Capabilities

Visual Tour

User Registration & Authentication

User registration interface

Features: Email verification with tokenized links, password strength validation, resend verification functionality.
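One common way to build a tokenized verification link is to sign the email address and an issue timestamp with a server-side secret. The sketch below uses only the Python standard library. `SECRET_KEY`, `make_token`, `verify_token`, and the one-hour expiry are hypothetical choices for this example, not details taken from the project.

```python
import base64
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; load from config in practice


def make_token(email, now=None):
    """Return a URL-safe token binding an email address to an issue time."""
    issued = int(time.time() if now is None else now)
    payload = f"{email}|{issued}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def verify_token(token, max_age_seconds=3600, now=None):
    """Return the email if the token is authentic and unexpired, else None."""
    try:
        encoded, signature = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:  # missing separator, bad padding, or non-ASCII input
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    email, issued = payload.decode().rsplit("|", 1)
    current = time.time() if now is None else now
    if current - int(issued) > max_age_seconds:
        return None
    return email
```

Under this scheme, a resend action simply issues a fresh token with a new timestamp; older tokens expire on their own.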

Main Dashboard (File Manager)

File Manager Dashboard

Capabilities: Drag-and-drop upload, real-time status polling (Processing → Indexed), file metadata display.
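The status polling described above can be sketched as a loop that repeatedly asks for a file's status until it reaches a terminal state. `wait_until_indexed` and `fetch_status` are hypothetical names, and the `Failed` state is an assumption; only `Processing` and `Indexed` appear in the description. The sketch is written in Python for consistency with the backend stack, whereas the project's own client is described as vanilla JavaScript.

```python
import time


def wait_until_indexed(fetch_status, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll fetch_status() until the file leaves the 'Processing' state.

    fetch_status is any zero-argument callable returning a status string,
    for example a wrapper around an HTTP GET to a status endpoint.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in ("Indexed", "Failed"):  # 'Failed' is an assumed terminal state
            return status
        if waited >= timeout:
            raise TimeoutError(f"still '{status}' after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.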

RAG Chat Interface

Features: Multi-document filtering, inline citations with [Source 1] references, context preview cards.
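To illustrate how inline `[Source N]` citations can be tied back to retrieved chunks, the sketch below numbers each chunk when building the context block, then maps the citation markers found in an answer back to those chunks. The chunk fields (`filename`, `text`) and both function names are assumptions for this example, not the project's actual schema.

```python
import re


def build_context(chunks):
    """Label each retrieved chunk as [Source N] so the model can cite it."""
    return "\n".join(
        f"[Source {i}] ({chunk['filename']}) {chunk['text']}"
        for i, chunk in enumerate(chunks, start=1)
    )


def cited_chunks(answer, chunks):
    """Return the chunks that the answer actually cites, in source order."""
    numbers = {int(n) for n in re.findall(r"\[Source (\d+)\]", answer)}
    # Ignore citation numbers that do not correspond to a retrieved chunk.
    return [chunks[n - 1] for n in sorted(numbers) if 1 <= n <= len(chunks)]


# Hypothetical retrieved chunks and model answer.
chunks = [
    {"filename": "spec.pdf", "text": "The API uses OAuth."},
    {"filename": "guide.pdf", "text": "Tokens expire hourly."},
]
answer = "Authentication relies on OAuth [Source 1]."
print([c["filename"] for c in cited_chunks(answer, chunks)])  # ['spec.pdf']
```

The cited chunks are what a context preview card would display next to the answer.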

Query Auditor & Guardrail Validation

Query Validation

The system enforces context grounding: it declines to generate an answer when the retrieved context is insufficient, and it logs query telemetry for auditing.
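A guardrail of this kind can be sketched as a gate placed in front of the generation call. The thresholds `MIN_RELEVANCE` and `MIN_CHUNKS`, the telemetry fields, and the function name are hypothetical; the source does not say how sufficiency is measured, so relevance scores are assumed here.

```python
# Hypothetical thresholds; tune against real retrieval scores.
MIN_RELEVANCE = 0.35
MIN_CHUNKS = 1


def answer_with_guardrail(query, retrieved, generate, audit_log):
    """Generate only when enough sufficiently relevant context was retrieved.

    retrieved: list of (chunk_text, relevance_score) pairs.
    generate:  callable(query, list_of_chunk_texts) returning the answer.
    audit_log: list that receives one telemetry record per query.
    """
    grounded = [text for text, score in retrieved if score >= MIN_RELEVANCE]
    allowed = len(grounded) >= MIN_CHUNKS
    # Record every query, including the ones that were blocked.
    audit_log.append({
        "query": query,
        "retrieved": len(retrieved),
        "grounded": len(grounded),
        "generated": allowed,
    })
    if not allowed:
        return None  # caller can show an "insufficient context" message
    return generate(query, grounded)
```

Returning `None` instead of calling the model means a blocked query costs no LLM tokens, and the audit record still shows why it was blocked.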

Admin Dashboard

Admin Capabilities: User management, storage analytics, real-time audit logs, system health monitoring, chat history auditing.

System Architecture

Client [Vanilla JS] --> Nginx Proxy
  |
  +--> API [FastAPI]
        |
        +--> Auth [Google OAuth]
        +--> LLM Ops [Groq API]
        +--> Task Queue [Redis]
              |
              +--> Worker [Celery] --> OCR/Index
        |
        +--> Data Layer
              |--> PostgreSQL (Metadata)
              |--> MinIO (Object Storage)
              |--> ChromaDB (Vectors)

Component Details

3.1 Frontend (/frontend)

3.2 Backend (/backend)

3.3 RAG Architecture

The RAG engine is modular and designed for high precision:

Key Design Decisions

Security Features