Skip to content

Conversation

@bokelley
Copy link
Contributor

Summary

  • Fix 'Create Docs Issue' button error (missing currentConversationData variable)
  • Add comprehensive per-message feedback UI with 5-star ratings, tags, and notes
  • Add thread-level feedback controls (sentiment, overall rating, notes)
  • Add execution diagnostics panel showing timing breakdown, cache metrics, and active rules
  • Add Claude-powered conversation diagnosis endpoint for analyzing thread quality
  • Add evaluation dashboard with aggregated feedback metrics

Changes

Bug Fix

  • Fixed ReferenceError when clicking "Create Docs Issue" button by declaring currentConversationData variable

Feedback System

  • Per-message 5-star ratings with immediate save
  • Predefined feedback tags (inaccurate, incomplete, confusing, etc.)
  • Free-form notes per message
  • Thread-level sentiment (thumbs up/down) and ratings

Execution Diagnostics

  • Timing breakdown (system prompt, LLM, tools)
  • Token usage including cache metrics
  • Active rules display
  • Tool execution plan visualization

Claude Diagnosis

  • POST /api/admin/addie/threads/:id/diagnose endpoint
  • Analyzes conversation quality and suggests improvements
  • Provides training data quality assessment

Evaluation Dashboard

  • Summary stats (total responses, ratings, latency)
  • Feedback tag distribution
  • Daily trend data
  • Low-rated threads list for review

Code Quality Fixes

  • Fixed migration to use jsonb_array_elements_text for JSONB feedback_tags
  • Added proper API key validation for diagnosis endpoint
  • Sanitized user feedback to prevent prompt injection
  • Added Claude API rate limit handling
  • Validated and clamped days parameter
  • Fixed XSS risk by escaping thread_id in onclick handlers

Test plan

  • Docker container starts successfully
  • Migrations apply without errors
  • Admin page loads at /admin-addie.html
  • Threads API returns data
  • Feedback summary API returns data
  • TypeScript compiles without errors
  • All tests pass

🤖 Generated with Claude Code

bokelley and others added 5 commits December 31, 2025 11:37
The action buttons in the thread view (Create Docs Issue, Add to
Perspectives, Suggest Rule Change) were referencing an undefined
variable `currentConversationData`. The thread view sets
`currentThreadData` but never set `currentConversationData`.

- Add `currentConversationData` variable declaration
- Set `currentConversationData = data` in viewThread()
- Update all action handlers to use `thread_id` instead of
  `conversation_id` since thread data uses that field name

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Build a feedback system for Addie threads to enable evaluation of
rule changes. This captures training data about what responses
work well and what doesn't.

Features:
- Per-message 5-star rating with auto-save on click
- Per-message feedback tags (accurate, helpful, missing_info, etc.)
- Per-message notes with expandable detail panel
- Thread-level feedback with thumbs up/down + stars
- Thread-level notes text area for overall impressions
- Display of existing feedback inline with messages
- All feedback saves to existing API endpoints

The feedback panel appears when viewing any thread and attaches
ratings to assistant messages. This creates a database of labeled
examples for evaluating Addie's performance when rules change.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Build a comprehensive training data system for evaluating Addie's
performance and testing rule changes.

## Execution Diagnostics
- Add timing breakdown fields to thread messages (system_prompt_ms,
  total_llm_ms, total_tool_ms, processing_iterations)
- Store cache performance (tokens_cache_creation, tokens_cache_read)
- Track active_rule_ids per message
- Update bolt-app to capture all execution metadata from Claude response
- Add collapsible "Execution Details" panel in thread message view

## Claude Diagnosis
- Add POST /api/admin/addie/threads/:id/diagnose endpoint
- Claude analyzes conversation and provides:
  - Response quality assessment
  - What worked well / could be improved
  - Suggested rule changes
  - Training data quality label
- "Diagnose with Claude" button in thread actions panel
- Includes admin feedback notes as context for analysis

## Evaluation Dashboard
- New "Evaluation" tab in admin UI
- Summary stats: total responses, rated, avg rating, positive/negative
- Feedback tags distribution
- Low-rated threads list for quick review
- GET /api/admin/addie/feedback/summary endpoint
- Filterable by date range (7/30/90 days)

## Database Migration (066)
- Add timing columns to addie_thread_messages
- Add active_rule_ids array column with GIN index
- Create addie_execution_analysis view for debugging
- Create addie_feedback_summary view for dashboard

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix migration error: use jsonb_array_elements_text instead of unnest for JSONB feedback_tags
- Drop and recreate views to handle column name changes
- Add Anthropic import at module level instead of dynamic import
- Validate ANTHROPIC_API_KEY before using diagnosis endpoint
- Sanitize feedback parameter to prevent prompt injection
- Add proper error handling for Claude API rate limits
- Validate and clamp days parameter (1-365 range)
- Fix XSS risk by escaping thread_id in onclick handlers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Resolve import conflict by keeping both Anthropic and getAddieBoltApp imports
- Rename migration 066 to 067 to avoid conflict with 066_organization_domains.sql

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@bokelley bokelley merged commit dc62fb5 into main Dec 31, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants