feat(agentic): programmatic hexplan generation #192

Diplow · 2025-12-15T00:29:45Z

Move hexplan creation from AI agent to API layer. The tRPC router now creates hexplan tiles deterministically before prompt building:

- Parent tiles: auto-generate step list from subtask children
- Leaf tiles: create single "Execute the task" step

This eliminates the need for hexPlanInitializerPath parameter and simplifies the prompt builder to always expect an existing hexplan.
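
The two generation paths described above can be sketched roughly as follows. The function names come from this PR, but the signatures, the `SubtaskChild` type, and the step formatting (a `📋` prefix for pending steps, numbered lines) are assumptions made for illustration, not the merged implementation.

```typescript
// Hypothetical shape of a subtask child; the real type lives in the mapping domain.
interface SubtaskChild {
  title: string;
  direction: number; // 1-6 for subtask children
}

// Assumed marker for a pending step.
const PENDING = "📋";

// Parent tiles: one pending step per subtask child, ordered by direction.
function generateParentHexplanContent(children: SubtaskChild[]): string {
  return [...children]
    .sort((a, b) => a.direction - b.direction)
    .map((child, i) => `${PENDING} ${i + 1}. ${child.title}`)
    .join("\n");
}

// Leaf tiles: a single step, optionally carrying the caller's instruction.
function generateLeafHexplanContent(instruction?: string): string {
  const step = `${PENDING} 1. Execute the task`;
  const trimmed = instruction?.trim();
  return trimmed ? `${step}\n\nInstruction: ${trimmed}` : step;
}

const parentPlan = generateParentHexplanContent([
  { title: "Write tests", direction: 2 },
  { title: "Design schema", direction: 1 },
]);
const leafPlan = generateLeafHexplanContent();
```

Because both generators are pure functions of the tile's children and the optional instruction, the router can call them before prompt building and always hand the prompt builder an existing hexplan.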

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Automated hexplan generation based on tile type (parent vs leaf).
Refactor
- Tile architecture now strictly enforces leaf or parent designation.
- Status markers updated from emoji-based to readable format (Pending, In progress, Completed, Blocked).
- Simplified hexplan execution model.
Chores
- Added environment configuration option.
- Updated development practices documentation.

Add HEXFRAME_MCP_SERVER environment variable (defaults to "hexframe") and pass it through to buildPrompt, replacing hardcoded MCP tool names in execution instructions with the configurable server name. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
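
As a rough illustration of what a configurable server name means for the prompt text, the sketch below assumes MCP tool names follow the `mcp__<server>__<tool>` pattern mentioned later in the review. `resolveMcpServer` and `mcpToolName` are hypothetical helpers; the PR itself reads the value through `env.HEXFRAME_MCP_SERVER`.

```typescript
const DEFAULT_MCP_SERVER = "hexframe";

// Hypothetical helper: pick the configured server name or fall back to the default.
function resolveMcpServer(env: Record<string, string | undefined>): string {
  return env["HEXFRAME_MCP_SERVER"] ?? DEFAULT_MCP_SERVER;
}

// Hypothetical helper: assumes tool names follow mcp__<server>__<tool>.
function mcpToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

const defaultTool = mcpToolName(resolveMcpServer({}), "updateItem");
const customTool = mcpToolName(
  resolveMcpServer({ HEXFRAME_MCP_SERVER: "hexframe-dev" }),
  "getItemByCoords",
);
```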

vercel · 2025-12-15T00:29:49Z

Project	Deployment	Review	Updated (UTC)
hexframe	Ready	Preview, Comment	Dec 15, 2025 0:30am

coderabbitai · 2025-12-15T00:29:56Z

Walkthrough

The PR refactors the hexplan generation and execution model. It removes the initialization-based flow (hexPlanInitializerPath) and replaces it with automatic generation of hexplans based on tile type (parent vs. leaf). This includes updating documentation, simplifying the MCP tool input schema, adding new content generator utilities, and threading the new model through prompt construction and agentic routing.

Changes

Cohort / File(s)	Summary
Documentation updates `CLAUDE.md`, `UBIQUITOUS.md`	Updated conceptual guidance to reflect new hexplan-driven execution model: tiles are either parent (orchestration) or leaf (concrete work), never both. Replaced initialization/bootstrapping descriptions with automatic hexplan generation on hexecute. Added status markers (Pending, In progress, Completed, Blocked) and clarified dual-audience content framing.
MCP tool schema simplification `src/app/services/mcp/handlers/tools.ts`	Removed `hexPlanInitializerPath` from hexecute input schema. Updated `instruction` parameter description from initialization-focused to execution-focused. Simplified handler to pass only `instruction` to underlying hexecute call.
Environment configuration `src/env.js`	Added new `HEXFRAME_MCP_SERVER` environment variable to server schema with default value "hexframe" and corresponding runtime mapping.
Hexplan generation utilities `src/lib/domains/agentic/utils/prompt-builder.ts`	Introduced two new exported content generators: `generateParentHexplanContent()` for parent tiles and `generateLeafHexplanContent()` for leaf tiles. Made `hexPlan` required in `PromptData` interface. Removed `instruction` and `hexPlanInitializerPath` fields. Refactored `buildHexplanSection()` to accept `hasSubtasks` boolean instead of `hexPlanInitializerPath`. Added logic to return COMPLETE or BLOCKED status when no pending steps remain.
Utility exports `src/lib/domains/agentic/utils/index.ts`	Added exports for `generateParentHexplanContent` and `generateLeafHexplanContent` alongside existing `buildPrompt` and `PromptData` exports.
Test suite migration `src/lib/domains/agentic/utils/__tests__/prompt-builder.test.ts`	Migrated tests to new v5 API surface using hexplan content model with emoji-prefixed steps. Updated `PromptData` fixtures to remove `instruction` and `hexPlanInitializerPath`. Added new test suites for content generators (`generateParentHexplanContent`, `generateLeafHexplanContent`). Reorganized tests into labeled groups and expanded coverage for status rendering (COMPLETE, BLOCKED) and XML-escaping behavior. Replaced hardcoded 'hexframe' with `DEFAULT_MCP_SERVER`.
Agentic router refactoring `src/server/api/routers/agentic/agentic.ts`	Imported new content generators (`generateParentHexplanContent`, `generateLeafHexplanContent`). Added hexplan creation flow: auto-generate hexplan if missing, compute content based on tile type (parent vs. leaf), and persist via mappingService. Replaced hardcoded/context-derived `mcpServerName` with `env.HEXFRAME_MCP_SERVER`. Removed all `hexPlanInitializerPath` references. Threaded `hexPlanContent` through prompt construction. Updated streaming response to include `hexecutePrompt` for transparency.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:

- `src/lib/domains/agentic/utils/prompt-builder.ts`: Core signature changes to the `PromptData` interface and the `buildHexplanSection()` function; validation that `hexPlan` is now always required and the new content generators handle both parent and leaf tile cases correctly.
- `src/server/api/routers/agentic/agentic.ts`: Hexplan generation and propagation logic; ensure `hexPlanContent` is correctly computed and threaded through all prompt-building paths; verify `env.HEXFRAME_MCP_SERVER` is consistently used.
- Test migration accuracy: Confirm new test fixtures and content generator tests cover edge cases (missing instruction in leaf tiles, parent tile step ordering, status marker rendering).
- Interface contract changes: Verify all call sites of `buildHexplanSection()` and `buildPrompt()` have been updated with the new signatures and parameter types.

Possibly related PRs

- Develop #128: Restructures MCP tool handlers and input normalization in the same file (`src/app/services/mcp/handlers/tools.ts`); may have overlapping changes to the hexecute tool interface.
- Develop #181: Both PRs overhaul the hexPlan initialization model and modify hexecute/prompt-building code paths; may introduce conflicting approaches.
- Develop #127: Reworks `src/app/services/mcp/handlers/tools.ts` into a centralized MCP tools registry framework; directly overlaps with hexecute tool changes in this PR.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the primary change: introducing programmatic hexplan generation at the API layer instead of agent-based generation.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

UBIQUITOUS.md (1)

85-87: Docs: make Direction-0 and leaf-hexplan wording consistent with auto-generated single-step behavior.

Line 86: “(…for the parent tile)” → “(…for the tile)”

Lines 120-121 vs. lines 242-243: clarify that a leaf hexplan is auto-seeded with a single step (optionally with instruction), rather than implying a fully authored multi-step plan.

Also applies to: 118-123, 240-243
src/server/api/routers/agentic/agentic.ts (1)
409-446: instruction is silently ignored once a hexplan exists (likely regression).
Currently instruction only affects generateLeafHexplanContent(...) during first-time hexplan creation. On subsequent calls, the instruction is dropped entirely from the prompt path.

One minimal fix: inject instruction into the <task> content for this run (without mutating the stored hexplan):
-          task: {
-            title: hexecuteContext.task.title,
-            content: hexecuteContext.task.content || undefined,
-            coords: taskCoords
-          },
+          task: {
+            title: hexecuteContext.task.title,
+            content: [
+              hexecuteContext.task.content?.trim(),
+              instruction ? `Instruction:\n${instruction}` : undefined
+            ].filter(Boolean).join('\n\n') || undefined,
+            coords: taskCoords
+          },
Apply the same change in both executeTask and hexecute.

Also applies to: 563-600
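
The merge logic in the suggested diff can be tried in isolation. The sketch below wraps it in a hypothetical `mergeTaskContent` helper (the name and signature are not from the PR) to show what ends up in the `<task>` content for each combination of stored content and per-run instruction.

```typescript
// Hypothetical helper mirroring the suggested diff: combine stored task content
// with a per-run instruction without mutating the stored hexplan.
function mergeTaskContent(
  content: string | null | undefined,
  instruction?: string,
): string | undefined {
  const parts = [
    content?.trim(),
    instruction ? `Instruction:\n${instruction}` : undefined,
  ].filter((part): part is string => Boolean(part));
  // An empty join yields "", which is mapped to undefined like in the diff.
  return parts.join("\n\n") || undefined;
}

const both = mergeTaskContent("Refactor the router", "Keep the public API stable");
const onlyInstruction = mergeTaskContent(undefined, "Keep the public API stable");
const neither = mergeTaskContent("   ");
```

Whitespace-only content is dropped by the `trim()` plus `filter` step, so a tile with no real content and no instruction still produces `undefined` rather than an empty string.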

🧹 Nitpick comments (6)

src/env.js (1)
37-39: Constrain HEXFRAME_MCP_SERVER to a safe identifier charset (avoid malformed mcp__*__ tool names).
-    HEXFRAME_MCP_SERVER: z.string().default("hexframe"),
+    HEXFRAME_MCP_SERVER: z
+      .string()
+      .trim()
+      .regex(/^[a-zA-Z0-9_-]+$/, "Invalid MCP server name")
+      .default("hexframe"),
Also applies to: 80-81
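
To see which values the suggested constraint would accept, here is a plain TypeScript stand-in that does not depend on zod. `parseMcpServerName` is a hypothetical function; it only approximates the trim, validate, and default behaviour of the schema in the diff above.

```typescript
// Same character set as the regex in the suggestion above.
const MCP_SERVER_PATTERN = /^[a-zA-Z0-9_-]+$/;

// Hypothetical stand-in for the schema: default when unset, else trim and validate.
function parseMcpServerName(raw: string | undefined): string {
  if (raw === undefined) return "hexframe";
  const name = raw.trim();
  if (!MCP_SERVER_PATTERN.test(name)) {
    throw new Error("Invalid MCP server name");
  }
  return name;
}

const fallbackName = parseMcpServerName(undefined);
const trimmedName = parseMcpServerName("  hexframe-dev ");
```

A value containing spaces or other punctuation, such as `"bad name"`, is rejected instead of being interpolated into a tool name.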
src/lib/domains/agentic/utils/__tests__/prompt-builder.test.ts (2)

2-5: Avoid duplicating the MCP server default in tests (reduce drift risk).

Consider exporting a single DEFAULT_MCP_SERVER from a shared config module (or from env module) and importing it here, rather than redefining 'hexframe' in test code.

480-529: Generator tests: add an explicit “no children” expectation or guard.

generateParentHexplanContent([]) currently has an undefined/implicit contract; either add a test asserting the intended output, or make the helper throw if called with 0 children to prevent accidental use on leaf tiles.
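
One way the "throw on 0 children" option could look is sketched below. `requireSubtasks` and `generateParentSteps` are hypothetical names used only for this example, and the step format is simplified; the real generator's output is not shown in this review.

```typescript
// Hypothetical guard: refuse to build a parent hexplan without subtask children.
function requireSubtasks<T>(children: readonly T[]): readonly T[] {
  if (children.length === 0) {
    throw new Error("Parent hexplan requires at least one subtask child");
  }
  return children;
}

// Simplified stand-in for the parent generator, with the guard applied first.
function generateParentSteps(titles: readonly string[]): string {
  return requireSubtasks(titles)
    .map((title, i) => `${i + 1}. ${title}`)
    .join("\n");
}

const twoSteps = generateParentSteps(["Plan", "Build"]);
```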

CLAUDE.md (1)

84-91: Clarify that “Leaf” can still have context children (only “no subtask children”).
Right now “Leaf Tile: Has no subtask children (directions 1-6)” implies it, but adding an explicit sentence avoids readers interpreting “leaf” as “no children at all” (which conflicts with the negative-direction context model). As per coding guidelines / learnings, keep the subtask-vs-context distinction crisp.
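
The subtask-vs-context distinction can be stated as a small predicate. This is a sketch based on the direction ranges in the coding guidelines (1 to 6 for subtasks, -1 to -6 for context, 0 for the hexplan); `isSubtaskDirection` and `isLeafTile` are hypothetical names.

```typescript
// Directions: 1..6 are subtask children, -1..-6 are context children, 0 is the hexplan.
function isSubtaskDirection(direction: number): boolean {
  return Number.isInteger(direction) && direction >= 1 && direction <= 6;
}

// A tile is a leaf when it has no subtask children, regardless of context children.
function isLeafTile(childDirections: readonly number[]): boolean {
  return !childDirections.some((direction) => isSubtaskDirection(direction));
}

const leafWithContext = isLeafTile([-1, -3, 0]); // context children and hexplan only
const parentWithContext = isLeafTile([-1, 2]);   // has a subtask child at direction 2
```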

src/server/api/routers/agentic/agentic.ts (1)

422-465: Extract a shared “ensureHexplanTile(...)” helper to avoid drift between endpoints.
The “create hexplan if missing” + “choose parent vs leaf” logic is duplicated; a small shared helper (module-local) would reduce future divergence (e.g., when you tweak status token rules or add idempotency handling).

Also applies to: 576-623
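
A shared helper along these lines might look like the sketch below. Everything except the name `ensureHexplanTile` is an assumption: the `HexplanStore` interface stands in for the PR's `mappingService`, the coordinate strings are opaque placeholder keys, the step format is simplified, and the real code would be asynchronous.

```typescript
// Hypothetical minimal store interface; the PR persists via mappingService instead.
interface HexplanStore {
  get(coords: string): string | undefined;
  set(coords: string, content: string): void;
}

interface EnsureHexplanInput {
  hexplanCoords: string;   // key of the direction-0 tile (format is a placeholder)
  subtaskTitles: string[]; // empty for leaf tiles
  instruction?: string;
}

// Return the existing hexplan, or create one based on tile type and persist it.
function ensureHexplanTile(store: HexplanStore, input: EnsureHexplanInput): string {
  const existing = store.get(input.hexplanCoords);
  if (existing !== undefined) return existing;

  const content =
    input.subtaskTitles.length > 0
      ? input.subtaskTitles.map((title, i) => `📋 ${i + 1}. ${title}`).join("\n")
      : `📋 1. Execute the task${input.instruction ? `\n\n${input.instruction}` : ""}`;

  store.set(input.hexplanCoords, content);
  return content;
}

// In-memory store for demonstration only.
const backing = new Map<string, string>();
const store: HexplanStore = {
  get: (coords) => backing.get(coords),
  set: (coords, content) => {
    backing.set(coords, content);
  },
};
```

Both endpoints could then call the same function, so the "create if missing" check and the parent-vs-leaf choice live in one place.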
src/lib/domains/agentic/utils/prompt-builder.ts (1)
139-200: Make step detection robust (avoid includes('📋') / includes('🔴')).
includes(...) can be tripped by arbitrary user text (e.g., an instruction containing 📋), which can incorrectly force the “pending steps” branch forever. Prefer line-based detection:
-  const hasPendingSteps = hexPlan.includes('📋')
-  const hasBlockedSteps = hexPlan.includes('🔴')
+  const hasPendingSteps = /^📋\s/m.test(hexPlan)
+  const hasBlockedSteps = /^🔴\s/m.test(hexPlan)
Optional: if hasBlockedSteps is true, consider surfacing BLOCKED even if pending steps exist (or explicitly instruct “resolve blockers first”), since blocked steps often gate progress.
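
The difference between substring and line-anchored detection is easy to demonstrate. The sketch below uses the two regexes from the diff above; `resolveHexplanStatus` and the `HexplanStatus` type are hypothetical, and the precedence (pending first, then blocked, then complete) follows the walkthrough's description of returning COMPLETE or BLOCKED only when no pending steps remain.

```typescript
type HexplanStatus = "PENDING" | "BLOCKED" | "COMPLETE";

// Line-anchored detection: a marker only counts at the start of a line.
function resolveHexplanStatus(hexPlan: string): HexplanStatus {
  const hasPendingSteps = /^📋\s/m.test(hexPlan);
  const hasBlockedSteps = /^🔴\s/m.test(hexPlan);
  if (hasPendingSteps) return "PENDING";
  return hasBlockedSteps ? "BLOCKED" : "COMPLETE";
}

// The pending emoji appears mid-line in user text, not as a step prefix.
const planWithStrayEmoji = "✅ 1. Do the work\n\nNote: the 📋 emoji appears in this instruction";
const substringSaysPending = planWithStrayEmoji.includes("📋");
const anchoredStatus = resolveHexplanStatus(planWithStrayEmoji);
```

Here the substring check reports pending steps for a plan that is actually finished, while the anchored check does not.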

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e22cb0 and 9a2217e.

📒 Files selected for processing (8)

CLAUDE.md (2 hunks)
UBIQUITOUS.md (3 hunks)
src/app/services/mcp/handlers/tools.ts (3 hunks)
src/env.js (2 hunks)
src/lib/domains/agentic/utils/__tests__/prompt-builder.test.ts (21 hunks)
src/lib/domains/agentic/utils/index.ts (1 hunks)
src/lib/domains/agentic/utils/prompt-builder.ts (4 hunks)
src/server/api/routers/agentic/agentic.ts (6 hunks)

🧰 Additional context used

📓 Path-based instructions (5)

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Always use absolute imports with ~/ prefix instead of relative imports (./ or ../) in TypeScript and JavaScript files

Files:

src/env.js
src/lib/domains/agentic/utils/__tests__/prompt-builder.test.ts
src/server/api/routers/agentic/agentic.ts
src/lib/domains/agentic/utils/index.ts
src/lib/domains/agentic/utils/prompt-builder.ts
src/app/services/mcp/handlers/tools.ts

**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Implement Domain-Driven Design patterns in TypeScript files within /src/lib/domains/ with clear boundaries between mapping, IAM, and other domains. Reference /src/lib/domains/README.md for domain implementation details
Use the hierarchical tile architecture for hexframe tiles: positive directions 1-6 for subtask children, negative directions -1 to -6 for context children, and direction 0 for hexplan (execution state and agent guidance). Tiles can have both subtask and context children simultaneously
In AI orchestration and hexplan implementation, agents must use emoji prefixes when updating hexplan tiles: 🟡 STARTED for task begun, ✅ COMPLETED for task finished, 🔴 BLOCKED for task stuck. Use standard MCP tools (getItemByCoords, updateItem) for hexplan updates
Implement Hexframe's execution philosophy: define system structure as a hierarchy with context and subtasks, run hexecute autonomously, monitor progress via hexplan tiles at direction-0, adjust by editing relevant hexplan tiles and restarting. Structure serves as the control interface, not chat history

Files:

src/lib/domains/agentic/utils/__tests__/prompt-builder.test.ts
src/server/api/routers/agentic/agentic.ts
src/lib/domains/agentic/utils/index.ts
src/lib/domains/agentic/utils/prompt-builder.ts
src/app/services/mcp/handlers/tools.ts

**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use Vitest (not Jest) for all tests in TypeScript files

Files:

src/lib/domains/agentic/utils/__tests__/prompt-builder.test.ts

src/server/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Backend must use tRPC for type-safe API with server-side caching and optimizations. Reference /src/server/README.md for backend architecture details

Files:

src/server/api/routers/agentic/agentic.ts

src/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Frontend must use Next.js 15 App Router with progressive enhancement, following static → progressive → dynamic component patterns, and implement localStorage caching for performance. Reference /src/app/map/README.md for UI architecture

Files:

src/app/services/mcp/handlers/tools.ts

🧠 Learnings (6)

📓 Common learnings

Learnt from: CR
Repo: Diplow/hexframe PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-06T16:06:30.756Z
Learning: The hexplan initialization system is itself a Hexframe system that demonstrates self-referential application of the model: it reads task hierarchies, analyzes context and subtask structure, generates initial hexplan tiles at direction-0, and sets up execution state for autonomous runs

Learnt from: CR
Repo: Diplow/hexframe PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-06T16:06:30.756Z
Learning: Applies to **/*.{ts,tsx} : Implement Hexframe's execution philosophy: define system structure as a hierarchy with context and subtasks, run hexecute autonomously, monitor progress via hexplan tiles at direction-0, adjust by editing relevant hexplan tiles and restarting. Structure serves as the control interface, not chat history

Learnt from: CR
Repo: Diplow/hexframe PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-06T16:06:30.756Z
Learning: Applies to **/*.{ts,tsx} : Use the hierarchical tile architecture for hexframe tiles: positive directions 1-6 for subtask children, negative directions -1 to -6 for context children, and direction 0 for hexplan (execution state and agent guidance). Tiles can have both subtask and context children simultaneously