A collaborative platform for annotating and evaluating LLM traces with MLflow integration, discovery phases, and inter-rater reliability analysis.
For detailed documentation, see the /doc folder:
- Release Notes - Latest release information and quick start
- Build Guide - Client build instructions
- Authentication Fix - Authentication improvements
- Annotation Editing - Annotation editing features
- All Documentation - Complete documentation index
Prerequisites:

- Python 3.11+
- Node.js 18+ and npm
- Databricks workspace with:
- MLflow experiments
You also need a Databricks profile configured locally, either DEFAULT or a named profile.
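If you have not set one up yet, the profile lives in `~/.databrickscfg`. A minimal sketch, with placeholder values:

```ini
# ~/.databrickscfg -- placeholder values, replace with your own
[DEFAULT]
host  = https://your-workspace.cloud.databricks.com
token = dapi-your-personal-access-token
```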
To get started, clone the repository and run the setup script:

```bash
git clone https://github.com/databricks-solutions/project-0xfffff.git
cd project-0xfffff
./setup.sh
```

This will:
- Install Python dependencies using uv
- Install Node.js dependencies
- Set up environment configuration
To deploy the application to Databricks Apps:
```bash
./deploy.sh
```

This will:
- Build the frontend
- Sync code to Databricks workspace
- Create and deploy the Databricks App
For local backend development with uv:

- Create a virtual environment and install dependencies:

  ```bash
  uv venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  uv pip install -e .
  ```

- Set up environment variables (see the example .env after these steps):

  ```bash
  export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
  export DATABRICKS_TOKEN="your-token"
  # Or create a .env file in the project root
  ```

- Run the FastAPI development server locally:

  ```bash
  uv run uvicorn server.app:app --reload --port 8000
  ```

  The API will be available at http://localhost:8000, with API documentation at http://localhost:8000/docs.
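If you prefer a .env file, a minimal sketch with placeholder values (the same two variables as the exports above):

```bash
# .env -- placeholder values, replace with your own
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=your-token
```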
Alternatively, using a standard virtual environment and pip:

- Create and activate a virtual environment:

  ```bash
  python3 -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install Python dependencies:

  ```bash
  pip install -e .
  # Or, for an editable install with dev dependencies:
  pip install -e ".[dev]"
  ```

- Set up environment variables:

  ```bash
  export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
  export DATABRICKS_TOKEN="your-token"
  ```

- Run the FastAPI development server:

  ```bash
  uvicorn server.app:app --reload --port 8000
  ```

  The API will be available at http://localhost:8000, with API documentation at http://localhost:8000/docs.
For frontend development:

- Navigate to the client directory:

  ```bash
  cd client
  ```

- Install Node dependencies:

  ```bash
  npm install
  ```

- Start the development server:

  ```bash
  npm run dev
  ```

  The UI will be available at http://localhost:5173.

- Build for production:

  ```bash
  npm run build
  ```
For manual deployment, ensure you have the Databricks CLI installed and configured:

```bash
databricks --version
databricks current-user me  # Verify authentication
```

Create the app:

```bash
databricks apps create human-eval-workshop
```

Build the frontend:

```bash
cd client && npm install && npm run build && cd ..
```

This creates an optimized production build in client/build/.

Sync the code to your Databricks workspace:

```bash
DATABRICKS_USERNAME=$(databricks current-user me | jq -r .userName)
databricks sync . "/Workspace/Users/$DATABRICKS_USERNAME/human-eval-workshop"
```

Refer to the Databricks Apps deploy documentation for more info.

Deploy the app:

```bash
databricks apps deploy human-eval-workshop \
  --source-code-path /Workspace/Users/$DATABRICKS_USERNAME/human-eval-workshop
```

Once deployed, the Databricks CLI will provide a URL to access your application.
After collecting human annotations, use `process_sqllite_db_mlflow.py` to sync them back to MLflow as structured feedback.
In a Databricks notebook:

```python
# Set database path via widget
dbutils.widgets.text("input_file", "/Volumes/catalog/schema/workshop.db", "Input File Path")
```

Standalone:

```python
from process_sqllite_db_mlflow import process_workshop_database

# Preview what will be synced (dry run)
process_workshop_database(db_path="workshop.db", dry_run=True)

# Actually sync to MLflow
process_workshop_database(db_path="workshop.db", dry_run=False)
```

Key features:

- Multi-metric support - Syncs multiple rubric ratings per annotation
- User attribution - Tracks who provided each rating
- Rating labels - Converts 1-5 scores to descriptive labels (e.g., "strongly agree")
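For reference, a minimal sketch of how a 1-5 score might map to a descriptive label. The real mapping lives in `process_sqllite_db_mlflow.py`; only "strongly agree" (5) is confirmed above, so the other labels here are assumptions:

```python
# Hypothetical 1-5 Likert mapping; only "strongly agree" (5) is confirmed.
RATING_LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}

def rating_label(score: int) -> str:
    """Return the descriptive label for a 1-5 rubric score."""
    return RATING_LABELS.get(score, f"unknown ({score})")
```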
Each annotation creates MLflow feedback entries:

```python
import mlflow
from mlflow.entities import AssessmentSource, AssessmentSourceType

mlflow.log_feedback(
    trace_id="tr-abc123...",
    name="accuracy",  # Extracted from the rubric question
    value=5,  # 1-5 rating
    rationale="strongly agree - John Doe | Comment: ...",
    source=AssessmentSource(
        source_type=AssessmentSourceType.HUMAN,
        source_id="john.doe@example.com"
    )
)
```

Configure facilitator accounts and security settings:
```yaml
facilitators:
  - email: "facilitator@email.com"
    password: "xxxxxxxxxx"
    name: "Workshop Facilitator"
    description: "Primary workshop facilitator"

security:
  default_user_password: "changeme123"
  password_requirements:
    min_length: 8
    require_uppercase: true
    require_lowercase: true
    require_numbers: true

session:
  token_expiry_hours: 24
  refresh_token_expiry_days: 7
```
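As an illustration of how the password_requirements settings might be enforced, a minimal sketch (the function name and return shape are assumptions, not the app's actual API):

```python
import re

def validate_password(password: str, min_length: int = 8,
                      require_uppercase: bool = True,
                      require_lowercase: bool = True,
                      require_numbers: bool = True) -> list[str]:
    """Return the list of violated rules; an empty list means the password passes."""
    errors = []
    if len(password) < min_length:
        errors.append(f"must be at least {min_length} characters")
    if require_uppercase and not re.search(r"[A-Z]", password):
        errors.append("must contain an uppercase letter")
    if require_lowercase and not re.search(r"[a-z]", password):
        errors.append("must contain a lowercase letter")
    if require_numbers and not re.search(r"\d", password):
        errors.append("must contain a number")
    return errors
```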
Set these environment variables for Databricks integration:

- `DATABRICKS_HOST` - Your Databricks workspace URL
- `DATABRICKS_TOKEN` - Personal access token or service principal token
See the LICENSE.MD file for details.