-
Notifications
You must be signed in to change notification settings - Fork 33
🤖 refactor: restore separate task + bash tools #1308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
7860d42 to
986955f
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
215bd13 to
db76e83
Compare
|
Fixed in db76e83 - |
|
@codex review Follow-up: fixed static-check by aligning RefreshController unit tests with the current (post-revert) RefreshController API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
mux/src/node/services/tools/task_await.ts
Lines 174 to 176 in f84415d
| try { | |
| const report = await taskService.waitForAgentReport(taskId, { | |
| timeoutMs, |
task_await now allows timeout_secs=0 (per schema/docs for non‑blocking bash), but createTaskAwaitTool forwards timeoutMs directly into taskService.waitForAgentReport. When timeout_secs is 0 and any agent task is included (explicitly or via the default “all active tasks”), waitForAgentReport asserts timeoutMs > 0 and throws, so those agent results come back as error instead of queued|running|awaiting_report. Consider treating 0 as “no wait” for agent tasks (return current status) or defaulting to the 10‑minute wait when timeout_secs is 0.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
Fixed in cac38ca.
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@codex review Addressed the truncation metadata issue for
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@codex review |
|
Codex Review: Didn't find any major issues. Another round soon, please! ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
84a1d42 to
ee8d4c4
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
0523720 to
a5fc33c
Compare
|
@codex review Fixed pending task finalization to skip Also extended workspace activity snapshots to include the last thinking level for the VS Code extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
bbe9ddd to
853b615
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
dc653d6 to
81cd629
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
10b63d2 to
4a2ea12
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
4a2ea12 to
ec44579
Compare
Change-Id: I93d9236d1160456b3f4f3c05815b70f84e812012 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ifa45aefee1298bd27137a4d1ffc6bf95517689b8 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I002ab1584a1c9e73ebdd06a126d5bf01acdd9756 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I1a869f9aea822126fdf7098161d5f5198fe12ee2 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Previously, timeout_secs=0 was treated as 'unset' and fell back to the default 10-minute timeout for agent tasks (while bash tasks correctly supported non-blocking 0-second waits). Now timeout_secs=0 is respected for both agent and bash tasks, enabling immediate non-blocking checks when needed. Addresses Codex review feedback. Change-Id: Iad7d72a342c9ae6c86ffcade4f7bc34e846376c5 Signed-off-by: Thomas Kosiewski <tk@coder.com>
When timeout_secs=0, task_await should return the current agent task status (queued/running/awaiting_report) rather than erroring due to waitForAgentReport's timeoutMs>0 invariant. Add a regression test. Change-Id: I6114dece5254064298bfe523339b17b1579abe48 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I8721464eef644e87bd4aebd60c0231b99af432e7 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ic45bb48c897b0bab9a0460908e7167f07c43bf6f Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I0405e87882b040e05611ed899c1590293d15deb1 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Iae7e5a1f29002e2ccbc5bf1e5f82c2e72c975400 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ia3295b2a2f720c18253f36f129edf853dd0b44c1 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I6a731b8665141f37c4fa4bbdf80302db201d1171 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I057eac2e131fd9e8a00281c028ae3aee6c49b3e9 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ib254ceb7cadcc2632bcd9f249767c0d1004b1167 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I5f986c0596dee69aa17c901ffb94322fa390fb69 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ib563d7fe7a5882aed0a61f338147e43e747bb5a2 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I44d0d38c54c9136a4f2459abe21b87d8c30bb45f Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I64827f3a8e4305fbe3a46978fa49ad78523e4662 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ibb874d6bbb7d38a458024deb1ed710d58c87d975 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I836b13fa243fd999eac70ae006d47501f57e30a6 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I3129f859e0f44bf6eaed3855a6a8222ed107dc4a Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I3dc061b4728cc6cf70f680628401eb02c540d876 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I5145c268728aa0476723eb1753e56e687d9e81af Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ica79dd1f0b0f528331922bb4bdeabb9da1222395 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I470711377e0d439be2724d4ed909e18ef76bd0cc Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I4fe6af0fe450b12528335bbb04310dbe37d866cf Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ib1a91a17f9d88b5b02314d1e608135887e835d90 Signed-off-by: Thomas Kosiewski <tk@coder.com>
d2bf219 to
3542154
Compare
|
Codex Review: Didn't find any major issues. Delightful! ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
This PR reverts the recent “unify bash into task_*” approach and restores the CC-style split:
task: spawn sub-agent workspaces onlybash: execute shell commands onlytask_await/task_list/task_terminate: remain the unified surface for waiting / rediscovering / terminating both agent tasks and background bash processesKey changes
taskschema + backend execution are agent-only again (nokind="bash").bash(run_in_background=true)now returnstaskId: "bash:<processId>"(in addition tobackgroundProcessId) so callers cantask_await/task_terminatewithout needing bash-specific wrappers.bashas its own tool (no moretask(kind="bash")special-casing).bashinstead oftask(kind="bash").Compatibility / behavior notes
task(kind="bash")is intentionally not supported anymore; it will be rejected by the tool schema and backend.task(kind="bash")will render as a generic tool call.Verification
make static-check📋 Implementation Plan
Plan: Revert to separate
task+bashtools while keeping unifiedtask_awaitGoal
Restore the CC-style split:
task: spawn sub-agent workspaces only (subagent_type,prompt,title,run_in_background).bash: execute shell commands only (script,timeout_secs,run_in_background,display_name).task_await: remain the single “wait/output” tool for both agent tasks and background bash processes.Recommended approach net LoC (product code only): ~-60 to -110 LoC (mostly deleting the
task(kind="bash")branch + schema fields, plus a small additive tweak tobashbackground results).Why this is needed (context)
We currently expose a unified
tasktool that accepts both agent-task args andkind="bash"args via a flattened Zod schema +superRefine(seesrc/common/utils/tools/toolDefinitions.tsandsrc/node/services/tools/task.ts). Even though it avoidsanyOf/oneOfin the provider-facing schema, it’s still an undesirable UX/architecture: bash execution should be initiated viabash, whiletask_awaitprovides the unified output surface.Should we prefix agent task IDs as
agent:<workspaceId>?Recommendation: don’t (for now).
task_awaitneeds a cheap routing signal.taskId: "bash:<processId>"tobashbackground results gets us the consistency win (bothtaskandbashreturn ataskIdusable bytask_await) without changing agent ID semantics.If we ever want stronger typing, prefer adding an explicit
kind: "agent" | "bash"field totask_listresults (low-risk) rather than changing the agent task ID format.Implementation Steps
1) Split tool input surface in
TOOL_DEFINITIONS(agent-onlytask)Files:
src/common/utils/tools/toolDefinitions.tssrc/common/utils/tools/tools.tsActions:
taskagent-only again:TaskToolArgsSchema = TaskToolAgentArgsSchema.TaskToolBashArgsSchemaand thesuperRefineunion enforcement wrapper.TOOL_DEFINITIONS.task.description: remove “Unified … (bash)” wording and only document agent spawning.TOOL_DEFINITIONS.bash.description: update background guidance to point attask_await/task_list/task_terminate(notbash_output/bash_background_*).TaskToolResultSchemato agent-only:TaskToolCompletedResultSchema(bashResult,exitCode,note,truncated).2) Remove bash dispatch from backend
tasktoolFile:
src/node/services/tools/task.tsActions:
task(kind="bash")execution branch.createBashTool,resolveBashDisplayName,toBashTaskId,BashToolResult, andformatBashReport().3) Return
taskIdfrombashbackground results (directly awaitable)Files:
src/common/utils/tools/toolDefinitions.ts(result schema)src/node/services/tools/bash.tssrc/node/services/tools/taskId.ts(import-only)Actions:
BashToolResultSchema:taskId: z.string()to the background-success variant, wheretaskIdis the prefixed form (bash:<processId>).bash.ts:run_in_background=true), includetaskId: toBashTaskId(spawnResult.processId).taskId: toBashTaskId(processId).4) Keep
task_awaitunified (no behavior change), but fix UX affordancesFiles:
src/node/services/tools/task_await.ts(likely no code changes)src/common/utils/tools/taskToolTypeGuards.tssrc/browser/stores/WorkspaceStore.tssrc/browser/components/tools/*wheretask(kind="bash")is special-casedActions:
taskIdprefix (bash:→ backgroundProcessManager; else agent task).task(kind="bash"):TaskToolArgshavingkind.unknown → LegacyTaskBashArgsguard for rendering old history.isTaskKindBashToolCall()withisBashLikeToolCall():toolName === "bash"→ bashtoolName === "task" && input.kind === "bash"→ legacy bash-in-tasktoolName === "task" && result.taskId startsWith("bash:")/bashResult→ legacy fallback5) Update token accounting buckets
Files:
src/common/utils/tokens/tokenStatsCalculator.tssrc/common/utils/tokens/tokenStatsCalculator.test.tsActions:
getConsumerInfoForToolCall():toolName === "bash"is its own consumer bucket.toolName === "task"andinput.kind === "bash", label astask (bash legacy).6) Update tests and fixtures to use
bash, nottask(kind="bash")Targeted updates (non-exhaustive but high-signal):
src/common/utils/tools/toolDefinitions.test.tstaskaccepts{ kind: "bash" … }.taskrejects bash-shaped input.src/node/services/tools/task.bash.test.tscreateTaskTool()withkind:"bash".task_await,task_list, andtask_terminatework withtaskId="bash:<processId>".tests/ipc/runtimeExecuteBash.test.tsbash(instead oftask).toolName === "bash"(noargs.kind).tests/ipc/sendMessage.context.test.ts,tests/ipc/ollama.test.tssrc/node/services/taskService.test.tstoolName:"task" input:{kind:"bash",…}withtoolName:"bash" input:{script,…}.Update any strict equality assertions for background bash results to allow / expect the new
taskIdfield.7) Verification
Run a tight set of checks after changes:
bun test src/node/services/tools/task.test.tsbun test src/node/services/tools/task.bash.test.tsbun test src/common/utils/tools/toolDefinitions.test.tsbun test src/common/utils/tokens/tokenStatsCalculator.test.tsTEST_INTEGRATION=1 bun x jest tests/ipc/runtimeExecuteBash.test.ts -t "execute simple bash commands"Generated with
mux• Model:openai:gpt-5.2• Thinking:xhigh