@ThomasK33 ThomasK33 commented Dec 22, 2025

Unifies modes (Plan/Exec/Compact) and subagents (explore/exec presets) into user-defined Agent Definitions.

Key points:

  • Agent definitions are discovered from ~/.mux/agents/*.md and <projectRoot>/.mux/agents/*.md (project overrides global overrides built-in).
  • YAML frontmatter is parsed + validated; markdown body becomes the agent prompt (layered with mux’s prelude).
  • New unified agentAiDefaults config (with back-compat for legacy modeAiDefaults / subagentAiDefaults).
  • System message composition now supports Agent: <agentId> scoped sections in AGENTS.md (and keeps Mode: scoping working).
  • The task tool now supports agentId (with subagent_type as a legacy alias).

Docs/tests:

  • New docs for Agent Definitions and Agent-scoped instruction sections.
  • Unit coverage for agent scoping + tool policy resolution.

CI/stability fixes included:

  • Make mux api --help runnable in the bundled ESM CLI output (define require via createRequire).
  • Make default worktree srcBaseDir respect MUX_ROOT during E2E runs (special-case ~/.mux / ~/.cmux tilde expansion).
  • E2E harness hardening: per-test MUX_ROOT isolation + sanity check + updated UI helpers for the Agent selector combobox.
  • Avoid git signing hangs in smoke tests and E2E demo repos (commit.gpgSign=false).
  • Stabilize OpenAI image integration tests and Storybook play tests (narrower queries / more reliable vision model).

📋 Implementation Plan

Plan: User-defined Agents (unify Modes + Subagents)

Goal

Unify modes (Plan/Exec/Compact) and subagents (explore/exec presets) into a single, extensible concept: Agent Definitions.

Users can define new “modes” and “subagents” via Markdown files discovered from:

  • ~/.mux/agents/*.md
  • <projectRoot>/.mux/agents/*.md

Each file’s Markdown body becomes the agent’s system prompt (layered with mux’s base prelude), and its YAML frontmatter is strongly typed and parsed/validated on discovery.

Additionally, consolidate model + thinking defaults so modes and subagents share one configuration pipeline (frontmatter defaults → config overrides → workspace overrides → inheritance).


Recommended approach (v1): “Agent Definitions” registry + unified AI defaults

Net LoC estimate (product code): ~1,200–1,700

1) Define the on-disk format + schema

1.1 File layout + precedence

  • Discover *.md files (non-recursive):
    1. Project-local: <projectRoot>/.mux/agents/*.md
    2. Global: ~/.mux/agents/*.md
    3. Built-ins (shipped in code)
  • Collision rule: project-local overrides global overrides built-in by agentId.
  • agentId is the filename without extension (e.g., plan.md → agentId="plan").
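
A minimal sketch of the collision rule above, assuming discovery has already produced one list of entries per source. The AgentIndexEntry shape and both function names are hypothetical and are not taken from the PR.

```typescript
// Hypothetical entry shape for illustration; the real index type is not shown in the plan.
type AgentSource = "built-in" | "global" | "project";

interface AgentIndexEntry {
  agentId: string;
  source: AgentSource;
  name: string;
}

// Derive agentId from a filename such as "plan.md" (strip the .md extension).
function agentIdFromFilename(filename: string): string {
  return filename.endsWith(".md") ? filename.slice(0, -".md".length) : filename;
}

// Layers are applied lowest precedence first, so later layers overwrite
// earlier ones: project-local beats global, which beats built-in.
function mergeByPrecedence(
  builtIns: AgentIndexEntry[],
  globals: AgentIndexEntry[],
  projectLocal: AgentIndexEntry[],
): Map<string, AgentIndexEntry> {
  const merged = new Map<string, AgentIndexEntry>();
  for (const layer of [builtIns, globals, projectLocal]) {
    for (const entry of layer) {
      merged.set(entry.agentId, entry);
    }
  }
  return merged;
}
```

Keying the map by agentId means a project-local plan.md replaces the built-in plan agent as a whole; this sketch does not merge individual frontmatter fields across layers.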

1.2 YAML frontmatter schema (strongly typed)

Create AgentDefinitionFrontmatterSchema (Zod) + TS types.

Proposed schema (v1):

---
name: Plan                # required, UI label
description: Create a plan before coding

# Visibility / availability
ui:
  selectable: true         # shows in main-agent UI selector
subagent:
  runnable: false          # allowed as task subagent

# AI defaults (baseline; can be overridden by config.json)
ai:
  modelString: openai:gpt-5.2
  thinkingLevel: medium

# Tool restrictions (simple + predictable)
# - base picks the baseline tool set (plan|exec|compact)
# - tools optionally *narrow* that set using exactly one strategy
policy:
  base: plan               # plan|exec|compact (default: exec)
  tools:
    deny: ["file_edit_insert", "file_edit_replace_string"]
    # OR:
    # only: ["web_fetch", "agent_skill_read"]
---

Notes:

  • agentId is derived from the filename (strip .md); there is no id field in frontmatter.
  • policy.base keeps compatibility with existing “mode-based” behavior (tool defaults + UX expectations).
  • policy.tools is optional. If present, it must specify exactly one of:
    • deny: allow the base tool set except these tools
    • only: allow only these tools from the base tool set (everything else is denied)
  • Tool names are validated against the tool registry; unknown names warn + ignore.
  • All policies are further restricted by runtime “hard-denies” (e.g., subagents can’t re-enable recursive task spawning).
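
The policy rules in these notes can be sketched without any library. The plan calls for a Zod schema; the hand-written validator here is a dependency-free stand-in for illustration, and validatePolicy, ValidatedPolicy, and the knownTools parameter are hypothetical names.

```typescript
// Hypothetical shapes mirroring the proposed frontmatter; not the PR's Zod schema.
type PolicyBase = "plan" | "exec" | "compact";

interface ToolFilter {
  deny?: string[];
  only?: string[];
}

interface ValidatedPolicy {
  base: PolicyBase;
  tools?: { kind: "deny" | "only"; names: string[] };
  warnings: string[];
}

function isPolicyBase(value: string): value is PolicyBase {
  return value === "plan" || value === "exec" || value === "compact";
}

function validatePolicy(
  raw: { base?: string; tools?: ToolFilter } | undefined,
  knownTools: ReadonlySet<string>,
): ValidatedPolicy {
  const warnings: string[] = [];
  const base = raw?.base ?? "exec"; // the plan states the default is exec
  if (!isPolicyBase(base)) {
    throw new Error(`policy.base must be plan|exec|compact, got "${base}"`);
  }
  const tools = raw?.tools;
  if (tools === undefined) return { base, warnings };

  // policy.tools must specify exactly one strategy.
  const hasDeny = tools.deny !== undefined;
  const hasOnly = tools.only !== undefined;
  if (hasDeny === hasOnly) {
    throw new Error("policy.tools must specify exactly one of deny or only");
  }
  const kind = hasDeny ? "deny" : "only";
  const requested = (hasDeny ? tools.deny : tools.only) ?? [];

  // Unknown tool names produce a warning and are ignored, not treated as fatal.
  const names = requested.filter((name) => {
    if (knownTools.has(name)) return true;
    warnings.push(`unknown tool "${name}" ignored`);
    return false;
  });
  return { base, tools: { kind, names }, warnings };
}
```

Runtime hard-denies are deliberately left out of this validator; they depend on whether the agent runs as a subagent, which is not known when the file is parsed.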

1.3 Parsing rules

Mirror agent skills parsing patterns:

  • Enforce a max file size (same ballpark as skills).
  • Parse YAML frontmatter + markdown body.
  • Derive agentId from the filename (strip .md).
  • On invalid file: skip it and surface a non-fatal diagnostic (logs + optional UI warning).
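
As a rough illustration of these parsing rules, the sketch below separates the frontmatter from the markdown body and reports a non-fatal result for invalid files. It does not parse the YAML itself (that would need a YAML library); splitAgentFile and MAX_AGENT_FILE_BYTES are hypothetical, and the 64 KiB limit is an assumed placeholder because the plan only says "same ballpark as skills".

```typescript
// Assumed placeholder limit; the plan does not state the actual value.
const MAX_AGENT_FILE_BYTES = 64 * 1024;

type SplitResult =
  | { ok: true; frontmatterYaml: string; body: string }
  | { ok: false; reason: string };

function splitAgentFile(content: string): SplitResult {
  // content.length counts UTF-16 code units, which only approximates bytes.
  if (content.length > MAX_AGENT_FILE_BYTES) {
    return { ok: false, reason: "file too large" };
  }
  const lines = content.split(/\r?\n/);
  if (lines[0].trim() !== "---") {
    return { ok: false, reason: "missing opening --- for frontmatter" };
  }
  // The first --- after the opening line closes the frontmatter.
  const end = lines.findIndex((line, index) => index > 0 && line.trim() === "---");
  if (end === -1) {
    return { ok: false, reason: "unterminated frontmatter" };
  }
  return {
    ok: true,
    frontmatterYaml: lines.slice(1, end).join("\n"),
    body: lines.slice(end + 1).join("\n").trim(),
  };
}
```

Returning a result object instead of throwing lets the caller skip the file and surface the reason as a diagnostic, matching the "skip it and surface a non-fatal diagnostic" rule.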

2) Implement discovery + reading (Node)

Create an agentDefinitionsService analogous to agentSkillsService.

Deliverables:

  • discoverAgentDefinitions({ projectRoot }) → list of index entries (id/name/description/flags + ai defaults + policy metadata).
  • readAgentDefinition({ projectRoot, agentId }) → returns the validated frontmatter + markdown body.
  • Cache results by (projectRoot, mtime) to avoid re-reading on every message; add a cheap invalidation strategy:
    • recompute if any discovered file’s mtime changes OR if the directory listing changes.
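
One possible shape for the invalidation strategy, assuming the caller has already collected a path and mtime for each discovered file (for example via a directory listing plus stat calls). DiscoveryCache, FileStamp, and listingFingerprint are hypothetical names.

```typescript
// Hypothetical stamp for one discovered file.
interface FileStamp {
  path: string;
  mtimeMs: number;
}

// The fingerprint changes if any file's mtime changes or if the set of
// files changes, which covers both invalidation conditions in the plan.
function listingFingerprint(stamps: FileStamp[]): string {
  return stamps
    .map((stamp) => `${stamp.path}@${stamp.mtimeMs}`)
    .sort()
    .join("|");
}

class DiscoveryCache<T> {
  private entries = new Map<string, { fingerprint: string; value: T }>();

  // Returns the cached value for projectRoot unless the fingerprint changed.
  get(projectRoot: string, stamps: FileStamp[], compute: () => T): T {
    const fingerprint = listingFingerprint(stamps);
    const cached = this.entries.get(projectRoot);
    if (cached !== undefined && cached.fingerprint === fingerprint) {
      return cached.value;
    }
    const value = compute();
    this.entries.set(projectRoot, { fingerprint, value });
    return value;
  }
}
```

Collecting the stamps still costs a directory listing and one stat per file on each lookup, but it avoids re-reading and re-validating file contents.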

3) Unify “modeAiDefaults” + “subagentAiDefaults” into one config model

3.1 New config field

Add a single field to global config (backed by Zod + TS types):

  • agentAiDefaults: Record<string, { modelString?: string; thinkingLevel?: ThinkingLevel }>

3.2 Back-compat migration

On config load:

  • If agentAiDefaults is missing, synthesize it from existing:
    • modeAiDefaults[plan|exec|compact] → agentAiDefaults[plan|exec|compact]
    • subagentAiDefaults[type] → agentAiDefaults[type]
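
The load-time synthesis could look roughly like this. The plan does not say which legacy entry wins when the same key (for example exec) appears in both legacy maps; letting the mode entry win is an assumption made here only for illustration. ThinkingLevel is stubbed as a string, and loadAgentAiDefaults is a hypothetical name.

```typescript
type ThinkingLevel = string; // stub; the real union type is not shown in the plan

interface AiDefaults {
  modelString?: string;
  thinkingLevel?: ThinkingLevel;
}

interface RawConfig {
  agentAiDefaults?: Record<string, AiDefaults>;
  modeAiDefaults?: Record<string, AiDefaults>;
  subagentAiDefaults?: Record<string, AiDefaults>;
}

function loadAgentAiDefaults(config: RawConfig): Record<string, AiDefaults> {
  // The new field, when present, is used as-is.
  if (config.agentAiDefaults !== undefined) {
    return config.agentAiDefaults;
  }
  // Otherwise synthesize it from the legacy fields.
  // Assumption: on a key collision the mode entry overwrites the subagent entry.
  return {
    ...(config.subagentAiDefaults ?? {}),
    ...(config.modeAiDefaults ?? {}),
  };
}
```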

On config save:

  • Write agentAiDefaults.
  • Either:
    • keep writing legacy keys for one release (lowest risk), or
    • stop writing legacy keys but keep reading them (medium risk).

4) Apply Agent Definitions to system prompt construction

4.1 Main agent

Update buildSystemMessage to incorporate the selected agentId:

  • Always include mux prelude.
  • Inject the agent definition markdown body as a dedicated “Agent Prompt” block.
  • Keep AGENTS.md layering (global + project-local).

4.2 Extend instruction scoping in AGENTS.md

Add support for a new scoped heading:

  • Agent: <agentId>

Rules:

  • For backward compat, keep Mode: plan|exec|compact working.
  • When building system prompt:
    • apply Agent: <agentId> sections
    • apply Mode: <policy.base> sections
    • apply Model: sections (unchanged)
    • apply Tool: sections (unchanged)
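
A small sketch of how the Agent: and Mode: rules might be evaluated for one heading. Case-insensitive matching is an assumption, Model: and Tool: headings are left to the existing logic, and scopedSectionApplies and ScopeContext are hypothetical names.

```typescript
interface ScopeContext {
  agentId: string;
  policyBase: "plan" | "exec" | "compact";
}

// Returns true or false for Agent:/Mode: scoped headings, and undefined for
// any other heading so that existing (unchanged) handling can decide.
function scopedSectionApplies(heading: string, ctx: ScopeContext): boolean | undefined {
  const match = /^(Agent|Mode):\s*(\S.*)$/i.exec(heading.trim());
  if (match === null) return undefined;
  const kind = match[1].toLowerCase();
  const value = match[2].trim().toLowerCase();
  if (kind === "agent") {
    return value === ctx.agentId.toLowerCase();
  }
  // Mode: sections are matched against the agent's policy.base, which keeps
  // existing Mode: plan|exec|compact sections working for custom agents.
  return value === ctx.policyBase;
}
```

For a custom agent review with policy.base plan, both an Agent: review section and a Mode: plan section would apply under this sketch.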

5) Make tool policies agent-driven (with subagent safety)

5.1 Policy resolution algorithm

Implement a single policy resolver:

  1. Start from policy.base:
     • plan → existing plan tool policy
     • exec → existing exec tool policy
     • compact → existing compact policy
  2. Optionally apply policy.tools (exactly one):
     • deny: remove tools from the base set
     • only: keep only these tools from the base set
  3. Apply runtime “hard-denies”:
     • If running as a subagent, forcibly deny:
       • task (no recursion)
       • propose_plan (main-agent only)
       • any other tools we consider unsafe for child agents (explicit list)
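
The three resolution steps can be sketched as a single function. The base tool sets are passed in by the caller because the plan does not list the actual sets; resolveToolPolicy and AgentPolicy are hypothetical names, and the hard-deny list contains only the two tools named in the plan.

```typescript
type PolicyBase = "plan" | "exec" | "compact";

interface AgentPolicy {
  base: PolicyBase;
  tools?: { deny: string[] } | { only: string[] };
}

// Only the two tools named in the plan; the full explicit list is not given.
const SUBAGENT_HARD_DENIES: readonly string[] = ["task", "propose_plan"];

function resolveToolPolicy(
  policy: AgentPolicy,
  baseTools: Record<PolicyBase, string[]>,
  isSubagent: boolean,
): string[] {
  // Step 1: start from the base tool set.
  let allowed = [...baseTools[policy.base]];

  // Step 2: optionally narrow with exactly one strategy.
  const filter = policy.tools;
  if (filter !== undefined) {
    if ("deny" in filter) {
      const denied = new Set(filter.deny);
      allowed = allowed.filter((tool) => !denied.has(tool));
    } else {
      const only = new Set(filter.only);
      allowed = allowed.filter((tool) => only.has(tool));
    }
  }

  // Step 3: runtime hard-denies are applied last, so they always win.
  if (isSubagent) {
    allowed = allowed.filter((tool) => !SUBAGENT_HARD_DENIES.includes(tool));
  }
  return allowed;
}
```

Because every step only filters the base set, an only list can never grant a tool the base policy lacks, and a subagent cannot regain task by listing it.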

5.2 Enforce in both frontend + backend

  • Frontend: use resolved policy to hide/disable UI affordances.
  • Backend: treat frontend state as advisory; enforce server-side before tool execution.

6) Unify AI defaults + overrides resolution for agents

Target resolution order (single algorithm used everywhere):

Explicit call args (e.g. /compact -m)                 [highest]
→ Workspace override for agentId (if supported)
→ Global config override: config.agentAiDefaults[agentId]
→ AgentDefinition frontmatter defaults (ai.*)
→ Inherit from parent context (subagents only)
→ Sticky last-used workspace values (main agent only)
→ System fallback model/thinking
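
A minimal sketch of this resolution order as a "first defined layer wins" lookup. Resolving modelString and thinkingLevel independently of each other is an assumption, as are the names resolveAiSettings and ResolutionInputs.

```typescript
interface AiSettings {
  modelString?: string;
  thinkingLevel?: string;
}

interface ResolutionInputs {
  explicitArgs?: AiSettings;        // e.g. /compact -m
  workspaceOverride?: AiSettings;   // per-agent workspace override
  configOverride?: AiSettings;      // config.agentAiDefaults[agentId]
  frontmatterDefaults?: AiSettings; // ai.* from the agent definition
  parentContext?: AiSettings;       // subagents only
  stickyLastUsed?: AiSettings;      // main agent only
  systemFallback: Required<AiSettings>;
  isSubagent: boolean;
}

function resolveAiSettings(inputs: ResolutionInputs): Required<AiSettings> {
  // Layers in priority order, highest first.
  const layers: Array<AiSettings | undefined> = [
    inputs.explicitArgs,
    inputs.workspaceOverride,
    inputs.configOverride,
    inputs.frontmatterDefaults,
    inputs.isSubagent ? inputs.parentContext : undefined,
    inputs.isSubagent ? undefined : inputs.stickyLastUsed,
  ];
  // Assumption: each field is resolved on its own, so a layer may set the
  // model while a lower layer supplies the thinking level.
  const pick = (key: keyof AiSettings): string => {
    for (const layer of layers) {
      const value = layer?.[key];
      if (value !== undefined) return value;
    }
    return inputs.systemFallback[key];
  };
  return { modelString: pick("modelString"), thinkingLevel: pick("thinkingLevel") };
}
```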

6.1 Workspace overrides

Generalize the existing “per-mode workspace override” to “per-agent workspace override”:

  • Workspace metadata: aiSettingsByAgentId: Record<string, AiSettings>
  • Local cache key: workspaceAiSettingsByAgentId:${workspaceId}

When user changes model/thinking while agentId=X, persist overrides under that agentId.

6.2 Replace WorkspaceModeAISync with WorkspaceAgentAISync

  • Drive “effective model/thinking” from (workspaceId, agentId).
  • Continue writing to legacy “active model/thinking” keys as a compatibility bridge until consumers are migrated.

7) UI changes

7.1 Replace ModeSelector with AgentSelector

  • New UI selector lists AgentDefinition entries where ui.selectable: true.
  • Persist the chosen agentId (global or per-project).
  • Keep an ergonomic keybind:
    • TOGGLE_MODE becomes “toggle between last two selected UI agents”.
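
The "toggle between last two selected UI agents" behavior could be tracked with a small piece of state like the following. AgentToggleHistory is a hypothetical helper, and persistence of the selection is left out.

```typescript
// Tracks the current and previously selected UI agents (hypothetical helper).
class AgentToggleHistory {
  private current: string;
  private previous: string | undefined = undefined;

  constructor(initialAgentId: string) {
    this.current = initialAgentId;
  }

  // Called when the user picks an agent from the selector.
  select(agentId: string): void {
    if (agentId === this.current) return;
    this.previous = this.current;
    this.current = agentId;
  }

  // Called by the keybind: swaps current and previous when a previous exists.
  toggle(): string {
    if (this.previous !== undefined) {
      [this.current, this.previous] = [this.previous, this.current];
    }
    return this.current;
  }
}
```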

7.2 Settings: single “Agents” defaults editor

Replace the split “Mode defaults” and “Subagent defaults” views with one:

  • Lists discovered agents, grouped:
    • UI-selectable agents
    • subagent-runnable agents
    • hidden/internal agents (optional)
  • For each agent, show a read-only “Policy” summary (to keep tool permissions understandable):
    • base policy (policy.base)
    • tool filter (deny or only)
    • effective tools preview (computed list; show both main-agent and subagent variants after hard-denies)
  • For each agent, allow configuring:
    • modelString (inherit / override)
    • thinkingLevel (inherit / override)

8) Subagents: switch from presets to agentId

8.1 Tool + API shape

Evolve the task tool schema:

  • New: task({ agentId: string, prompt: string, ... })
  • Keep accepting subagent_type as an alias for 1–2 releases (mapped to agentId).
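
The legacy alias might be normalized at the tool boundary roughly as follows. Preferring agentId when both fields are present is an assumption, and normalizeTaskArgs and TaskToolArgs are hypothetical names; other task arguments are omitted.

```typescript
interface TaskToolArgs {
  agentId?: string;
  subagent_type?: string; // legacy alias, accepted for a transition period
  prompt: string;
}

interface NormalizedTaskArgs {
  agentId: string;
  prompt: string;
}

function normalizeTaskArgs(args: TaskToolArgs): NormalizedTaskArgs {
  // Assumption: agentId takes priority if a caller sends both fields.
  const agentId = args.agentId ?? args.subagent_type;
  if (agentId === undefined || agentId === "") {
    throw new Error("task requires agentId (or the legacy subagent_type)");
  }
  return { agentId, prompt: args.prompt };
}
```

Normalizing once at the entry point means the rest of the backend only deals with agentId, so removing the alias later touches a single function.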

8.2 Backend behavior

  • Validate requested agentId exists and subagent.runnable: true.
  • Build subagent system prompt from that agent definition.
  • Apply tool policy resolver with “subagent hard-denies”.
  • Apply unified AI defaults resolution (with parent inheritance when agent/config/frontmatter doesn’t specify).

9) Telemetry + timing

Current telemetry expects a small fixed mode union in some places.

  • Add agentId: string to relevant telemetry events.
  • Keep mode as the derived policy.base for backward compat and dashboards.
  • Update sessionTimingService schema so custom agentIds don’t crash timing aggregation.

10) Tests / validation

Unit tests (fast):

  • AgentDefinition parsing:
    • valid frontmatter + body
    • invalid YAML / missing fields
    • agentId derived from filename (strip .md)
  • Discovery precedence:
    • project overrides global overrides built-in
  • Tool policy merge:
    • base policy + deny list
    • base policy + only list
    • subagent hard-deny always wins
  • AI defaults resolution:
    • config override beats frontmatter
    • workspace override beats config

Integration tests (targeted):

  • System message includes the agent definition body + Agent: scoped AGENTS.md sections.
  • Task creation uses agentId and enforces hard-denies.

11) Documentation

Add/extend user docs so this feature is discoverable and predictable:

  • New docs page (e.g. docs/agents.mdx):
    • What an “Agent Definition” is (unifies modes + subagents)
    • Discovery paths + precedence (<project>/.mux/agents/*.md overrides ~/.mux/agents/*.md)
    • File format (frontmatter schema + markdown body semantics)
    • Examples for:
      • a UI-selectable agent (Plan-like)
      • a subagent-runnable definition (Explore-like)
    • Tool policy semantics:
      • policy.base
      • tools.deny vs tools.only
      • subagent hard-denies (cannot be re-enabled)
    • AI defaults resolution order (frontmatter defaults vs config overrides vs workspace overrides vs inheritance)
  • Update docs/instruction-files.mdx:
    • Document new Agent: <agentId> scoped sections
    • Clarify interaction with existing Mode: scoping (derived from policy.base)
  • Add docs navigation entry (docs.json) for the new page.

Alternatives (not recommended for v1)

Option B: Only add custom subagents (keep modes fixed)

Net LoC (product): ~400–700

  • Keep Plan/Exec modes as-is.
  • Add ~/.mux/agents/*.md only for subagent presets.
  • Continue using modeAiDefaults for modes; unify only subagent side.

Pros: much smaller surface area.
Cons: does not deliver “custom modes” and does not unify defaults/UI.

Option C: Agent Definitions replace mode-based tool policy entirely

Net LoC (product): ~2,000–3,000

  • Remove AgentMode/UIMode assumptions.
  • Tool policy becomes fully data-driven (no plan/exec base).

Pros: cleanest long-term architecture.
Cons: high risk; lots of knock-on refactors.


Generated with mux • Model: openai:gpt-5.2 • Thinking: xhigh

@ThomasK33 ThomasK33 force-pushed the modes-config-20hn branch 5 times, most recently from f917d27 to 6d948e1 Compare December 23, 2025 21:44
@ThomasK33 ThomasK33 changed the title 🤖 feat: mode-scoped AI defaults + per-mode overrides 🤖 feat: user-defined agents (unify modes + subagents) Dec 24, 2025
@ThomasK33 ThomasK33 force-pushed the modes-config-20hn branch 3 times, most recently from c2aa0c9 to e61f678 Compare December 28, 2025 19:56
Change-Id: I19d4bc5c5dd1e5b2a38a4a3e6021bf0b8543b839
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I1a5ba1d32ff0a15abae85af904d89074e36be101
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I8d55fb7ca4c3173706e390846b77416f7540af59
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ic00fe3e1cd68818771ac324787461a0427fcfb05
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I9daeab5067c65855a32f44c9626b8f855072fe9d
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ib691f6831627e0e03ecfb26339a6bd9b4a4c310c
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I73e70a106d775fe476864725b484c74a210e0775
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I5b50d8a66415580553f621b90b3b7504d779b59a
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I48698b7b296f11f250b24a9d0889b41df23362d5
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ibf3179de947d6c5cce6b9fdd8c81d55638c2c235
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I33242062387e70d5ada677c9e774bced561db1c6
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I97371ec54fbf9048540fd079ae097c1846e8133a
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I33815ac9fd94d7df809fcd39a35279947b5967bd
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Id8bdbdeb79baf7f56334f17abf6124b02945259d
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I3f1b6951196c5566ea232fcb36b11da72884cba5
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I51daf107e657bec4aae5acec9df63e1073604b53
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Id3b0f961c1c90a2db7f692a962037fc041191357
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ie0fe4dc9e794eb6b2b59c1ab68871ef44b2c8312
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I16452dc0aa6814ceb20e6608e143d9f1b0e34fd8
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: If9e57b0a146c41c59e0028480ae41ff0632e0216
Signed-off-by: Thomas Kosiewski <tk@coder.com>