A collaborative platform for annotating and evaluating LLM traces with MLflow integration, discovery phases, and inter-rater reliability analysis.
For detailed documentation, see the /doc folder:
- Release Notes - Latest release information and quick start
- Build Guide - Client build instructions
- Authentication Fix - Authentication improvements
- Annotation Editing - Annotation editing features
- All Documentation - Complete documentation index
Prerequisites:

- Python 3.11+
- Node.js 18+ and npm
- Databricks workspace with:
- MLflow experiments
You also need a Databricks profile configured locally, either DEFAULT or a named profile.
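If you have not set one up yet, the profile lives in `~/.databrickscfg`. A minimal sketch, with placeholder values:

```ini
# ~/.databrickscfg -- placeholder values, replace with your own
[DEFAULT]
host  = https://your-workspace.cloud.databricks.com
token = dapi-your-personal-access-token
```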
To get started, clone the repository and run the setup script:

```bash
git clone https://github.com/databricks-solutions/project-0xfffff.git
cd project-0xfffff
./setup.sh
```

This will:
- Install Python dependencies using uv
- Install Node.js dependencies
- Set up environment configuration
To deploy the application to Databricks Apps:
```bash
./deploy.sh
```

This will:
- Build the frontend
- Sync code to Databricks workspace
- Create and deploy the Databricks App
For local backend development with uv:

- Create a virtual environment and install dependencies:

  ```bash
  uv venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  uv pip install -e .
  ```

- Set up environment variables (see the example .env after these steps):

  ```bash
  export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
  export DATABRICKS_TOKEN="your-token"
  # Or create a .env file in the project root
  ```

- Run the FastAPI development server locally:

  ```bash
  uv run uvicorn server.app:app --reload --port 8000
  ```

  The API will be available at http://localhost:8000, with API documentation at http://localhost:8000/docs.
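If you prefer a .env file, a minimal sketch with placeholder values (the same two variables as the exports above):

```bash
# .env -- placeholder values, replace with your own
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=your-token
```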
Alternatively, using a standard virtual environment and pip:

- Create and activate a virtual environment:

  ```bash
  python3 -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install Python dependencies:

  ```bash
  pip install -e .
  # Or, for an editable install with dev dependencies:
  pip install -e ".[dev]"
  ```

- Set up environment variables:

  ```bash
  export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
  export DATABRICKS_TOKEN="your-token"
  ```

- Run the FastAPI development server:

  ```bash
  uvicorn server.app:app --reload --port 8000
  ```

  The API will be available at http://localhost:8000, with API documentation at http://localhost:8000/docs.
For frontend development:

- Navigate to the client directory:

  ```bash
  cd client
  ```

- Install Node dependencies:

  ```bash
  npm install
  ```

- Start the development server:

  ```bash
  npm run dev
  ```

  The UI will be available at http://localhost:5173.

- Build for production:

  ```bash
  npm run build
  ```
For manual deployment, ensure you have the Databricks CLI installed and configured:

```bash
databricks --version
databricks current-user me  # Verify authentication
```

Create the app:

```bash
databricks apps create human-eval-workshop
```

Build the frontend:

```bash
cd client && npm install && npm run build && cd ..
```

This creates an optimized production build in client/build/.

Sync the code to your Databricks workspace:

```bash
DATABRICKS_USERNAME=$(databricks current-user me | jq -r .userName)
databricks sync . "/Workspace/Users/$DATABRICKS_USERNAME/human-eval-workshop"
```

Refer to the Databricks Apps deploy documentation for more info.

Deploy the app:

```bash
databricks apps deploy human-eval-workshop \
  --source-code-path /Workspace/Users/$DATABRICKS_USERNAME/human-eval-workshop
```

Once deployed, the Databricks CLI will provide a URL to access your application.
After collecting human annotations, use `process_sqllite_db_mlflow.py` to sync them back to MLflow as structured feedback.
In a Databricks notebook:

```python
# Set database path via widget
dbutils.widgets.text("input_file", "/Volumes/catalog/schema/workshop.db", "Input File Path")
```

Standalone:

```python
from process_sqllite_db_mlflow import process_workshop_database

# Preview what will be synced (dry run)
process_workshop_database(db_path="workshop.db", dry_run=True)

# Actually sync to MLflow
process_workshop_database(db_path="workshop.db", dry_run=False)
```

Key features:

- Multi-metric support - Syncs multiple rubric ratings per annotation
- User attribution - Tracks who provided each rating
- Rating labels - Converts 1-5 scores to descriptive labels (e.g., "strongly agree")
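For reference, a minimal sketch of how a 1-5 score might map to a descriptive label. The real mapping lives in `process_sqllite_db_mlflow.py`; only "strongly agree" (5) is confirmed above, so the other labels here are assumptions:

```python
# Hypothetical 1-5 Likert mapping; only "strongly agree" (5) is confirmed.
RATING_LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}

def rating_label(score: int) -> str:
    """Return the descriptive label for a 1-5 rubric score."""
    return RATING_LABELS.get(score, f"unknown ({score})")
```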
Each annotation creates MLflow feedback entries:

```python
import mlflow
from mlflow.entities import AssessmentSource, AssessmentSourceType

mlflow.log_feedback(
    trace_id="tr-abc123...",
    name="accuracy",  # Extracted from the rubric question
    value=5,  # 1-5 rating
    rationale="strongly agree - John Doe | Comment: ...",
    source=AssessmentSource(
        source_type=AssessmentSourceType.HUMAN,
        source_id="john.doe@example.com"
    )
)
```

Configure facilitator accounts and security settings:
```yaml
facilitators:
  - email: "facilitator@email.com"
    password: "xxxxxxxxxx"
    name: "Workshop Facilitator"
    description: "Primary workshop facilitator"

security:
  default_user_password: "changeme123"
  password_requirements:
    min_length: 8
    require_uppercase: true
    require_lowercase: true
    require_numbers: true

session:
  token_expiry_hours: 24
  refresh_token_expiry_days: 7
```
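As an illustration of how the password_requirements settings might be enforced, a minimal sketch (the function name and return shape are assumptions, not the app's actual API):

```python
import re

def validate_password(password: str, min_length: int = 8,
                      require_uppercase: bool = True,
                      require_lowercase: bool = True,
                      require_numbers: bool = True) -> list[str]:
    """Return the list of violated rules; an empty list means the password passes."""
    errors = []
    if len(password) < min_length:
        errors.append(f"must be at least {min_length} characters")
    if require_uppercase and not re.search(r"[A-Z]", password):
        errors.append("must contain an uppercase letter")
    if require_lowercase and not re.search(r"[a-z]", password):
        errors.append("must contain a lowercase letter")
    if require_numbers and not re.search(r"\d", password):
        errors.append("must contain a number")
    return errors
```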
Set these environment variables for Databricks integration:

- `DATABRICKS_HOST` - Your Databricks workspace URL
- `DATABRICKS_TOKEN` - Personal access token or service principal token
See the LICENSE.MD file for details.