Skip to content

Streamlit-based chatbot to interact with PDFs using Retrieval-Augmented Generation (RAG), FAISS, Sentence Transformers, and Mistral LLM

License

Notifications You must be signed in to change notification settings

Rakshath66/Chat-With-Your-PDF

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

15 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ“„ PDF Chatbot β€” Minimal RAG App using Streamlit + FAISS + Mistral

A clean, fast PDF-based RAG chatbot built with SentenceTransformers, FAISS vector search, and OpenAI’s Mistral-7B β€” all inside a beautiful Streamlit UI.

βœ… Built with: Streamlit, FAISS, SentenceTransformers, OpenAI, PyPDF, Mistral 7B

GitHub Repo stars GitHub forks MIT License


πŸ“Έ Preview

image

πŸ“„ Live Links


🧠 Features

  • πŸ“„ Upload and chat with PDFs
  • πŸ” Finds relevant chunks using FAISS + embeddings
  • πŸ’¬ Ask any question in natural language
  • 🧬 Embeds text using MiniLM-L6-v2
  • 🧠 Answers via OpenAI’s Mistral-7B
  • 🎨 Custom chat bubble UI in Streamlit

πŸš€ Getting Started

πŸ”§ Prerequisites

  • Python 3.9+
  • Get your API key from OpenAI

πŸ–₯️ Local Installation

# 1. Clone this repo
git clone https://github.com/rakshath66/chat-with-your-pdf.git
cd chat-with-your-pdf

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Add your API key in .env
echo "OPENAI_API_KEY=your_openai_key" >> .env

# 5. Launch the app
streamlit run src/streamlit_app.py

πŸ” Environment Variables

Create a .env file (or set in Streamlit Secrets):

OPENAI_API_KEY=your_openai_key

If you're using Streamlit Cloud/ HuggingFace, paste this into Settings β†’ Secrets:

OPENAI_API_KEY = "sk-..."

πŸ’¬ Example Prompts

  • "Summarize the full document."
  • "What is the main conclusion in page 3?"
  • "List all key entities mentioned."
  • "Who is the author and when was this written?"

πŸ“ Project Structure

chat-with-your-pdf/
β”œβ”€β”€ .env
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ src/
β”‚   └── streamlit_app.py
└── images/
    └── ui.png

🀝 Contributing

We welcome contributions! Here's how you can help:

βœ… To Contribute:

  1. Fork this repository
  2. Clone your fork: git clone https://github.com/Rakshath66/Chat-With-Your-PDF.git
  3. Create a new branch: git checkout -b feature/my-feature
  4. Make your changes and commit: git commit -m "Add: your message here"
  5. Push to your branch: git push origin feature/my-feature
  6. Open a Pull Request with a description of your changes

πŸ” Please write clean code, add docstrings if needed, and test your features!


🀝 Contributing Issues

Contributions, issues, and feature requests are welcome!

🍴 Fork this repo -> Make your changes -> Test thoroughly -> πŸ“© Submit a pull request

Please ensure your code follows best practices and includes helpful comments/documentation if needed.


πŸ“œ Code Commit Style

Follow Conventional Commits:

  • feat: new feature
  • fix: bug fix
  • docs: documentation update
  • refactor: code refactor
  • style: UI or formatting
  • chore: maintenance tasks

Example:

git commit -m "feat: added multi-pdf upload support"

πŸ§ͺ Testing

Make sure your code:

  • Doesn’t break the main app
  • Works on local Streamlit
  • Follows a consistent UI/UX style

πŸ™ Thank You

Every contribution makes this project better. Whether it's a typo fix or a new feature β€” you're appreciated!


πŸ“„ License

MIT Β© Rakshath U Shetty

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software... [rest of MIT license]

⭐ Star this project if you like it!

It helps others discover it, and motivates me to build more free AI tools. Also, feel free to open issues, request features, or contribute.


πŸ›£οΈ Next Steps β€” Roadmap


βœ… Phase 1: Basic RAG PDF Chatbot βœ… (Done)

  • βœ… Upload PDF
  • βœ… Chunk + embed with MiniLM
  • βœ… FAISS vector store
  • βœ… OpenAI + Mistral response

πŸ“ Phase 2: Multi-PDF Support

  • πŸ“š Support multiple PDFs at once
  • πŸ” Search across all PDFs in vector DB
  • 🧩 Track source chunk in response

🧠 Phase 3: Chunk Highlight + Source Tracking

  • ✨ Show which PDF chunk was used
  • πŸ”Ž Highlight sentence or paragraph
  • πŸ“Ž Add page numbers in answer

🧠 Phase 4: Simple Memory (Session-based)

  • 🧠 Let chatbot remember previous Q/A per PDF session
  • πŸ” Keep conversation thread for 1 session

βš™οΈ Phase 5: Backend API Support

  • πŸ”§ Wrap logic into FastAPI or Flask
  • πŸ” Expose /ask endpoint with PDF + query
  • πŸ› οΈ Use as an API for other frontends

🌐 Phase 6: URL + Website Reader (optional)

  • πŸ“° Add summarize_url support
  • 🌍 Upload a link β†’ extract β†’ chat like PDF

πŸ§‘β€πŸ’» Built by Rakshath U Shetty

  • Open source forever
  • Designed for learning, research, and practical use
  • Reach out via issues or PRs β€” ideas welcome!


Let me know if you want:
- `LICENSE` file (MIT version)
- A matching `.env.example` file
- `demo/screenshot.png` placeholder
- `contributing.md` file

All of this helps boost your open-source visibility!