This repository contains a high-performance, intelligent RAG (Retrieval-Augmented Generation) system, developed as a submission for the HackRx 6.0 competition. While the project placed 42nd in a highly competitive field, the resulting architecture is a robust, production-ready application that showcases advanced AI engineering and DevOps practices.
The system is designed to process complex, multi-modal documents from a URL and answer a series of questions with high accuracy and low latency; it is deployed on AWS via a fully automated CI/CD pipeline.
- Multi-Modal Document Processing: Capable of intelligently parsing complex documents, including PDFs, DOCX, PPTX (with image OCR), XLSX, and even recursively scanning ZIP archives to find and process the most relevant file.
- Hybrid-Cloud AI Strategy: Leverages a "best-of-breed" approach, using Amazon Titan Embeddings V2 for high-throughput, accurate semantic search and Google Gemini for its state-of-the-art reasoning and content generation capabilities.
- Dynamic Three-Tier Processing Engine: The system intelligently analyzes each incoming request and routes it to the most efficient processing tier, optimizing for both speed and accuracy.
- Advanced RAG Techniques: Implements a sophisticated RAG pipeline for large documents, including dynamic question classification and hypothetical document generation (HyDE) to improve retrieval accuracy on complex questions.
- Fully Asynchronous & Parallelized: Built on `asyncio`, the entire pipeline is non-blocking. Document ingestion (chunk embedding) and question answering are performed in parallel, ensuring maximum performance (see the sketch after this list).
- Automated CI/CD Pipeline on AWS: The application is containerized with Docker and automatically tested and deployed to AWS App Runner using GitHub Actions on every push to the `main` branch.
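
To make the concurrency model concrete, here is a minimal sketch of the parallel question-answering fan-out. The `answer_one` coroutine is a hypothetical stand-in; the repo's actual function names and pipeline steps will differ:

```python
import asyncio

async def answer_one(question: str) -> str:
    # Stand-in for the real pipeline step: embed the question, query the
    # vector store, then call the LLM. Simulated here with a short sleep.
    await asyncio.sleep(0.1)  # placeholder for network-bound work
    return f"answer to: {question}"

async def answer_all(questions: list[str]) -> list[str]:
    # Fan out: all questions run concurrently, so wall-clock time is
    # roughly that of the slowest single question, not the sum of all.
    return await asyncio.gather(*(answer_one(q) for q in questions))

if __name__ == "__main__":
    print(asyncio.run(answer_all(["What is covered?", "What is excluded?"])))
```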
This system is more than a simple script; it's a resilient, scalable service. The two key architectural patterns are its processing engine and its model resilience strategy.
To ensure optimal performance, the system does not use a one-size-fits-all approach. It dynamically selects one of three tiers based on the input:
- Tier 1: Agentic Path (for API tasks)
  - Trigger: The system detects that a request is not a document query but a direct instruction (e.g., "go to this URL and find the token").
  - Action: Bypasses the RAG pipeline entirely and uses Gemini's Tool Calling feature to directly interact with web resources and solve the task.
- Tier 2: Full-Context Path (for small documents)
  - Trigger: A new document is ingested and found to be under a size threshold (e.g., 50,000 characters).
  - Action: Skips the expensive RAG process (chunking, embedding, vector search). Instead, it loads the entire document text into the LLM's context window for a single, comprehensive Q&A call. This is faster and more accurate for smaller files.
- Tier 3: High-Accuracy RAG Pipeline (for large documents)
  - Trigger: The document is too large for the context window.
  - Action: Engages the full RAG pipeline, including parallelized embedding with AWS Titan, upsert to Pinecone, and parallelized question answering with Google Gemini.
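
As a rough illustration of this routing logic, here is a sketch in Python. The 50,000-character threshold comes from the description above; the function names and the instruction-detection heuristic are assumptions for illustration only:

```python
FULL_CONTEXT_LIMIT = 50_000  # character threshold quoted above

def looks_like_instruction(questions: list[str]) -> bool:
    """Hypothetical heuristic; the real system classifies requests
    with more than a simple prefix check."""
    verbs = ("go to", "fetch", "visit", "call")
    return any(q.lower().startswith(verbs) for q in questions)

def select_tier(questions: list[str], document_text: str | None) -> str:
    # Tier 1: a direct instruction (or no document at all) goes to the
    # agentic, tool-calling path and skips RAG entirely.
    if document_text is None or looks_like_instruction(questions):
        return "agentic"
    # Tier 2: small documents fit in the context window, so all
    # questions are answered in one full-context LLM call.
    if len(document_text) <= FULL_CONTEXT_LIMIT:
        return "full_context"
    # Tier 3: everything else takes the full RAG pipeline
    # (chunk -> embed -> Pinecone upsert -> retrieve -> generate).
    return "rag"
```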
To ensure the service is robust against temporary API failures, the answer generation step is designed with a fallback mechanism:
- Primary Model: All generation requests are first sent to the primary LLM (e.g., `gemini-2.5-flash-lite`).
- Fallback Model: If the primary model returns a server-side error (such as a `500`), the system automatically retries the request with a secondary, highly reliable model (e.g., `gemini-2.5-flash`). This makes the application resilient to transient issues with a specific model API.
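
In code, this fallback reduces to a try/except around the generation call. A minimal sketch using the `google-generativeai` client; the repo's actual wrapper, error handling, and retry policy may differ:

```python
import google.generativeai as genai
from google.api_core import exceptions as gexc

PRIMARY_MODEL = "gemini-2.5-flash-lite"
FALLBACK_MODEL = "gemini-2.5-flash"

def generate_with_fallback(prompt: str) -> str:
    # Assumes genai.configure(api_key=...) was called at startup.
    try:
        # First attempt: the fast, low-cost primary model.
        return genai.GenerativeModel(PRIMARY_MODEL).generate_content(prompt).text
    except gexc.InternalServerError:
        # Primary returned a 5xx: retry once on the more reliable fallback.
        return genai.GenerativeModel(FALLBACK_MODEL).generate_content(prompt).text
```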
| Category | Technology |
|---|---|
| Backend | FastAPI, Uvicorn |
| AI / ML | Google Gemini, AWS Titan Embeddings V2, Pinecone |
| Data Processing | Pydantic, PyPDF, Docx, Pytesseract (OCR) |
| DevOps | Docker, GitHub Actions, AWS App Runner, Amazon ECR |
- Clone the repository:

  ```bash
  git clone https://github.com/B4K2/HackRx6.0.git
  cd HackRx6.0
  ```
- Create a virtual environment and install dependencies:
  - This project uses `pyproject.toml` for dependency management.

  ```bash
  python -m venv .venv
  source .venv/bin/activate
  pip install .
  ```
- Configure Environment Variables:
  - Create a `.env` file in the project root by copying the example:

    ```bash
    cp .env.example .env
    ```
  - Fill in the required API keys and settings in the `.env` file.
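
  The authoritative list of keys lives in `.env.example`; based on the stack above, the file plausibly contains entries along these lines (variable names are illustrative assumptions, not taken from the repo):

  ```bash
  # Hypothetical variable names -- check .env.example for the real ones.
  GOOGLE_API_KEY=your-gemini-api-key
  PINECONE_API_KEY=your-pinecone-api-key
  AWS_ACCESS_KEY_ID=your-aws-access-key
  AWS_SECRET_ACCESS_KEY=your-aws-secret-key
  AWS_REGION=us-east-1
  ```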
- Run the application:

  ```bash
  uvicorn app.main:app --reload
  ```

  The API will be available at `http://127.0.0.1:8000`.
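
For a quick smoke test, a request against the running service might look like the following. Note that the endpoint path, auth header, and payload shape here are assumptions based on the HackRx task format, not confirmed from the repo; check `app/main.py` for the real route:

```python
import requests

# Endpoint and payload are illustrative assumptions.
resp = requests.post(
    "http://127.0.0.1:8000/hackrx/run",
    headers={"Authorization": "Bearer <your-team-token>"},
    json={
        "documents": "https://example.com/policy.pdf",
        "questions": ["What is the grace period for premium payment?"],
    },
    timeout=120,
)
print(resp.json())
```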
This project is configured with a complete CI/CD pipeline using GitHub Actions. The workflow is defined in `.github/workflows/deploy.yml` and performs the following steps on every push to the `main` branch:
- Run Tests: Installs dependencies and runs `pytest` to ensure the application is healthy.
- Build Docker Image: Builds a new, clean container image of the application.
- Push to ECR: Tags the image with the commit SHA and pushes it to a private Amazon ECR repository.
- Deploy to App Runner: Triggers a new deployment on the AWS App Runner service, updating the application to the latest version.
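
For orientation, a workflow implementing those four steps typically looks something like the skeleton below. The actual `deploy.yml` in the repo is authoritative; the job layout, action versions, and image name here are assumptions:

```yaml
# Illustrative skeleton only -- see .github/workflows/deploy.yml for the real file.
name: Deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # 1. Run tests
      - run: |
          pip install .
          pytest
      # 2-3. Build the image and push it to ECR, tagged with the commit SHA
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}
      - uses: aws-actions/amazon-ecr-login@v2
        id: ecr
      - run: |
          docker build -t ${{ steps.ecr.outputs.registry }}/hackrx:${{ github.sha }} .
          docker push ${{ steps.ecr.outputs.registry }}/hackrx:${{ github.sha }}
      # 4. Trigger the App Runner deployment (e.g., via
      #    awslabs/amazon-app-runner-deploy or the AWS CLI)
```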