BAHR (بَحْر, meaning "sea" or "meter" in Arabic) is a comprehensive platform for analyzing and understanding Arabic classical poetry through advanced NLP techniques and prosodic analysis.
✨ The platform is LIVE in production!
- 🌐 Try it now: https://frontend-production-6416.up.railway.app/
- 📚 API Documentation: https://backend-production-c17c.up.railway.app/docs
- 💚 Health Status: https://backend-production-c17c.up.railway.app/health
Production Stats:
- ✅ 98.1% meter detection accuracy
- ✅ 16 classical Arabic meters supported
- ✅ Redis caching (5-10x speedup)
- ✅ 220 passing tests, 99% coverage
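The Redis caching figure above refers to reusing results for verses that have already been analyzed. A minimal sketch of that pattern follows, assuming a cache keyed by a hash of the whitespace-normalized verse. `cache_key`, `AnalysisCache`, the `bahr:analysis:` key prefix, and the `analyze` callback are hypothetical names, and a plain dict stands in for Redis so the example runs without a server. It is not the project's actual implementation.

```python
import hashlib
import json
from typing import Callable, Dict, Optional


def cache_key(verse: str) -> str:
    """Build a stable key: drop tatweel, collapse whitespace, then hash."""
    normalized = " ".join(verse.replace("\u0640", "").split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"bahr:analysis:{digest}"  # key prefix is an assumption


class AnalysisCache:
    """Dict-backed stand-in for a Redis client (hypothetical helper)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, verse: str, analyze: Callable[[str], dict]) -> dict:
        key = cache_key(verse)
        cached: Optional[str] = self._store.get(key)
        if cached is not None:
            self.hits += 1
            return json.loads(cached)
        self.misses += 1
        result = analyze(verse)
        # Store JSON text, as one would with Redis string values.
        self._store[key] = json.dumps(result, ensure_ascii=False)
        return result
```

Diacritics are deliberately left in the key, since two spellings of a verse that differ only in vowel marks may scan differently.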
- 🎼 Meter Detection - Automatic identification of Arabic poetic meters (البحور)
- 📊 Syllable Segmentation - Precise prosodic analysis using CAMeL Tools
- ✨ Rhyme Analysis - Pattern extraction and validation
- 🌐 RTL-First UI - Beautiful Arabic-first interface with Next.js 16
- 🔍 Real-time Analysis - Instant feedback on poetry structure
- 📚 Golden Dataset - 52 annotated classical verses for testing
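Meter detection can be pictured as comparing a verse's scansion against canonical foot patterns. The sketch below assumes a hemistich has already been reduced to a string of `/` (vowelled letter) and `o` (unvowelled letter), and covers only four meters (الطويل، الكامل، الرمل، الوافر). The `detect_meter` function and the use of `difflib` similarity scoring are illustrative assumptions, not the engine's actual method.

```python
from difflib import SequenceMatcher
from typing import Dict, Tuple

# "/" marks a vowelled letter, "o" an unvowelled one.
FAULUN = "//o/o"          # فعولن
MAFAILUN = "//o/o/o"      # مفاعيلن
MUTAFAILUN = "///o//o"    # متفاعلن
FAILATUN = "/o//o/o"      # فاعلاتن
MUFAALATUN = "//o///o"    # مفاعلتن

# Canonical hemistich templates, without any permitted variations.
METERS: Dict[str, str] = {
    "الطويل": FAULUN + MAFAILUN + FAULUN + MAFAILUN,
    "الكامل": MUTAFAILUN * 3,
    "الرمل": FAILATUN * 3,
    "الوافر": MUFAALATUN * 2 + FAULUN,
}


def detect_meter(pattern: str) -> Tuple[str, float]:
    """Return the best-matching meter name and its similarity in [0, 1]."""
    scored = [
        (SequenceMatcher(None, pattern, template).ratio(), name)
        for name, template in METERS.items()
    ]
    score, name = max(scored)
    return name, score
```

A pattern identical to a template scores 1.0. Verses that use permitted foot variations score lower, so a real detector would need explicit rules for those variations rather than raw string similarity.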
cd src/frontend
npm install
npm run dev
Visit: http://localhost:3000
cd src/backend
# Install as editable package (recommended)
pip install -e .
# Or install dependencies directly
pip install -r requirements.txt
# Start server
uvicorn app.main:app --reload
Visit: http://localhost:8000/docs
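Once the server is running, one way to confirm it is up is to poll a health endpoint. The sketch below assumes the local server exposes the same `/health` path as the production backend and answers with HTTP 200 when healthy. `fetch_status` and `wait_until_healthy` are hypothetical helpers built on the standard library only.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_healthy(
    base_url: str = "http://localhost:8000",
    attempts: int = 10,
    delay: float = 0.5,
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Poll <base_url>/health until it returns 200 or attempts run out."""
    url = base_url.rstrip("/") + "/health"  # path assumed from production
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

The `fetch` parameter is injectable so the polling logic can be exercised without a running server.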
- Framework: Next.js 16.0.1 with App Router
- Language: TypeScript (strict mode)
- Styling: Tailwind CSS v4
- Components: shadcn/ui (New York style)
- Fonts: Cairo (UI) + Amiri (poetry) via next/font/google
- RTL: Native dir="rtl" support
- Framework: FastAPI 0.115+
- Language: Python 3.11+
- NLP: CAMeL Tools for Arabic processing
- Database: PostgreSQL 15+ with SQLAlchemy
- Cache: Redis 7+
- Migration: Alembic
- Containerization: Docker + Docker Compose
- CI/CD: GitHub Actions
- Deployment: Railway (backend) + Vercel (frontend)
BAHR/
├── src/
│ ├── backend/ # FastAPI backend
│ │ ├── app/ # Application code
│ │ │ ├── api/ # API routes
│ │ │ ├── core/ # Core prosody engine
│ │ │ ├── ml/ # ML models & training
│ │ │ └── db/ # Database models
│ │ ├── alembic/ # Database migrations
│ │ └── tests/ # Backend unit tests
│ └── frontend/ # Next.js 16 frontend
│ ├── src/
│ │ ├── app/ # App Router pages
│ │ └── components/ # React components
│ └── public/ # Static assets
├── data/
│ ├── raw/ # Raw ML datasets (158 JSONL files)
│ ├── processed/ # Processed datasets & golden set
│ └── interim/ # Intermediate processing files
├── docs/
│ ├── api/ # API documentation
│ ├── research/ # Research documentation
│ ├── technical/ # Technical specs
│ ├── deployment/ # Deployment guides
│ ├── refactor/ # Refactoring documentation
│ └── releases/ # Release notes
├── results/
│ ├── ml/ # ML training results
│ ├── evaluations/ # Model evaluations
│ └── diagnostics/ # Analysis outputs
├── tests/
│ └── integration/ # Integration tests
├── scripts/
│ ├── ml/ # ML training scripts
│ ├── ml_pipeline/ # ML pipeline & training
│ ├── tools/ # Development tools
│ ├── data_processing/ # Data processing scripts
│ ├── setup/ # Environment setup
│ └── refactor/ # Migration scripts
├── models/ # Trained ML models
├── infrastructure/ # Docker & deployment
└── archive/ # Historical documentation
├── phases/ # Phase reports
└── sessions/ # Session summaries
Note: The repository was refactored on November 14, 2025 for production readiness. Backward-compatibility symlinks were removed after the migration succeeded. See docs/refactor/ for details.
- 🚀 Quick Start Guide - Get up and running in 5 minutes
- 📖 API Documentation - Complete API reference and guides
- 🏗️ Technical Specifications - Architecture and implementation details
- 🚢 Deployment Guide - Railway deployment instructions
- 🔧 Refactoring Docs - Repository structure and migration details
- /docs/api/ - API guides and specifications
- /docs/research/ - Research documentation and datasets
- /docs/technical/ - Technical implementation details
- /docs/deployment/ - Deployment and DevOps guides
- /docs/refactor/ - Refactoring documentation
- /archive/ - Historical documentation and phase reports
📋 November 14, 2025 Update: Repository refactored for production readiness.
See docs/refactor/Repo_Refactor_Plan.md for complete details.
🎉 PHASE 1 COMPLETE - LIVE IN PRODUCTION!
Phase: All of Phase 1 (Weeks 1-8) ✅ COMPLETE
Progress: 100% of MVP - DEPLOYED TO PRODUCTION
Launch Date: November 10, 2025
- Complete technical documentation (40+ files)
- Next.js 16 frontend with RTL + Arabic fonts
- Golden dataset v0.20 (52 annotated verses)
- FastAPI backend with CORS middleware
- Docker Compose configuration (PostgreSQL + Redis)
- CI/CD workflows (GitHub Actions)
- Prosody Engine Core (Week 1-2)
- Text normalization with CAMeL Tools
- Phonetic analysis (CV pattern extraction)
- Taqti3 algorithm (syllable segmentation)
- Bahr detection (4 meters: الطويل، الكامل، الرمل، الوافر)
- 98.1% accuracy on test dataset ✅ (exceeds 90% target)
- Database & Infrastructure
- Alembic migrations with 8 performance indexes
- 16 Arabic meters + 8 prosodic feet seeded
- PostgreSQL 15 running in Docker
- Testing & Quality
- 220 passing tests
- 99% code coverage
- Accuracy test suite with golden dataset
- Production Readiness (Week 0)
- Railway CLI installed
- CORS policy configured
- Database indexes documented (ADR-002)
- Railway project setup (CLI ready; the project still needs to be created)
- API endpoints implementation (Week 2)
- Frontend-Backend integration
- Production deployment to Railway + Vercel
- Authentication & user management
- Performance optimization
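The phonetic analysis step listed above turns a diacritized verse into a pattern of vowelled and unvowelled letters, which later steps compare against meter templates. The sketch below assumes fully diacritized input and ignores many real cases (hamzat al-wasl, the definite article before sun letters, unwritten long vowels, the silent alif after tanwin, pausal forms at the end of a verse). `to_cv_pattern` is a hypothetical name and does not reproduce the project's Taqti3 algorithm.

```python
import unicodedata

SHORT_VOWELS = {"\u064E", "\u064F", "\u0650"}  # fatha, damma, kasra
TANWIN = {"\u064B", "\u064C", "\u064D"}        # fathatan, dammatan, kasratan
SHADDA = "\u0651"
TATWEEL = "\u0640"


def to_cv_pattern(text: str) -> str:
    """Map diacritized Arabic to '/' (vowelled) and 'o' (unvowelled) symbols."""
    # Group every base letter with the combining marks that follow it,
    # so the order of shadda and vowel in the input does not matter.
    groups = []
    for ch in text:
        if ch == TATWEEL or ch.isspace():
            continue
        if unicodedata.combining(ch):
            if groups:
                groups[-1][1].add(ch)
        elif unicodedata.category(ch).startswith("L"):
            groups.append((ch, set()))

    pattern = []
    for _letter, marks in groups:
        if SHADDA in marks:
            pattern.append("o")   # a doubled letter starts with an unvowelled copy
        if marks & TANWIN:
            pattern.append("/o")  # nunation adds a pronounced, unvowelled n
        elif marks & SHORT_VOWELS:
            pattern.append("/")
        elif SHADDA not in marks:
            pattern.append("o")   # sukun or no mark (e.g. a long-vowel letter)
    return "".join(pattern)
```

Applied to the foot فعولن written with full diacritics, the function yields `//o/o`, and مفاعيلن yields `//o/o/o`.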
BAHR includes a comprehensive set of shell aliases for common development tasks. To use them:
# Add to your ~/.zshrc
source /Users/YOUR_USERNAME/Desktop/Personal/BAHR/.bahr_aliases.sh
# Reload shell
source ~/.zshrc
Available commands:
- bahr-help - Show all available commands
- bahr-setup - Complete environment setup
- bahr-start/stop/restart - Manage Docker services
- bahr-migrate - Run database migrations
- bahr-test - Run tests with coverage
- bahr-backend/frontend - Start development servers
- Plus 30+ more utilities for navigation, testing, and database management
See the full command list by running bahr-help after sourcing the aliases file.
We welcome contributions! See CONTRIBUTING.md for guidelines.
# 1. Fork and clone
git clone https://github.com/YOUR_USERNAME/BAHR.git
cd BAHR
# 2. Create feature branch
git checkout -b feature/your-feature-name
# 3. Make changes and test
npm test # Frontend tests
pytest # Backend tests
# 4. Commit and push
git commit -m "feat: add your feature"
git push origin feature/your-feature-name
# 5. Create Pull Request
The project includes a Golden Dataset of 42 manually annotated classical Arabic verses:
- ✅ Schema-validated JSONL format
- ✅ Prosodic annotations (meters, feet, rhymes)
- ✅ Metadata (poet, era, source)
- ✅ Quality assurance reports
See dataset/evaluation/README.md
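Schema validation of a JSONL file amounts to parsing each line as JSON and checking it against the expected fields. The sketch below shows the idea with hypothetical field names (`text`, `meter`, `poet`, `era`, `source`) chosen to mirror the annotations and metadata listed above. The real schema is defined with the dataset and may differ.

```python
import json
from typing import Iterable, List, Tuple

# Hypothetical required fields; the real schema lives with the dataset.
REQUIRED_FIELDS = {"text": str, "meter": str, "poet": str, "era": str, "source": str}


def validate_jsonl(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs; an empty list means all records pass."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        for field, expected in REQUIRED_FIELDS.items():
            if field not in record:
                problems.append((number, f"missing field: {field}"))
            elif not isinstance(record[field], expected):
                problems.append((number, f"wrong type for field: {field}"))
    return problems
```

Because the function accepts any iterable of strings, an open file handle can be passed directly, for example `validate_jsonl(open(path, encoding="utf-8"))` with `path` pointing at a dataset file.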
- 🔒 JWT-based authentication
- 🛡️ OWASP Top 10 compliance
- 🔐 Secrets management via Railway/Vercel
- 🚫 Rate limiting & DDoS protection
See docs/technical/SECURITY.md
This project is licensed under the MIT License - see LICENSE file for details.
- CAMeL Tools - Arabic NLP toolkit
- shadcn/ui - Beautiful UI components
- Next.js Team - Amazing React framework
- FastAPI - High-performance Python framework
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Built with ❤️ for Arabic Poetry Enthusiasts
BAHR is a comprehensive platform for analyzing and understanding classical Arabic poetry through advanced natural language processing techniques and prosodic analysis.
- 🎼 Meter Detection - Automatic identification of prosodic meters
- 📊 Prosodic Scansion - Precise analysis of syllables
- ✨ Rhyme Analysis - Extraction and validation of rhyme patterns
- 🌐 Native Arabic Interface - Beautiful design with full Arabic support
- 🔍 Real-time Analysis - Immediate feedback on poem structure
- 📚 Golden Dataset - 42 annotated classical verses
# Frontend
cd frontend && npm install && npm run dev
# Backend (coming soon)
cd backend && pip install -r requirements.txt
Phase: Phase 0 complete ✅
Progress: 60%
- ✅ Complete documentation
- ✅ Frontend (Next.js 16)
- ✅ Golden dataset
- 🔄 Backend development (Week 1)
Made with love ❤️ for Arabic poetry enthusiasts