@edgarpavlovsky

Summary

Fixes the terminal-bench integration test that was disabled in CI.

Root Cause

The benchmark/pyproject.toml was missing package configuration, causing the Fireteam adapter installation to fail with:

ValueError: Unable to determine which files to ship inside the wheel

Changes

  1. Fixed pyproject.toml - Added [tool.hatch.build.targets.wheel] with packages = ["adapters"]
  2. Enabled CI test - Changed from if: false to run on main and e/* branches
  3. Fixed PATH issues - Added $HOME/.local/bin to PATH for uv and terminal-bench
  4. Added timeouts - 20min job timeout, 15min step timeout to prevent hanging
  5. Better logging - Added timestamps and unbuffered Python output
  6. Local testing setup - Added .actrc for testing with act locally

Testing

  • ✅ Package installs successfully locally
  • ✅ Package installs successfully in act container
  • ✅ Adapter is importable
  • ✅ Test executes (verified with act)
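
The importability and PATH checks above could be scripted roughly as follows. This is a hedged sketch rather than the test used in this PR: `is_importable` and `missing_tools` are hypothetical helpers, the module name `adapters` is inferred from the `packages = ["adapters"]` setting, and `~/.local/bin` mirrors the `$HOME/.local/bin` directory added to PATH in the workflow.

```python
import importlib.util
import os
import shutil


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def missing_tools(names, extra_dir="~/.local/bin"):
    """Return the tool names not found on PATH, searching extra_dir first."""
    search_path = os.pathsep.join(
        [os.path.expanduser(extra_dir), os.environ.get("PATH", "")]
    )
    return [name for name in names if shutil.which(name, path=search_path) is None]


if __name__ == "__main__":
    # "adapters" is assumed to be the importable name of the installed package.
    print("adapters importable:", is_importable("adapters"))
    print("missing tools:", missing_tools(["uv"]))
```

Running it inside the `act` container and on a local machine gives a quick way to compare the two environments before the full integration test starts.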

Local Iteration

Developers can now test this workflow locally using:

act -j integration-tests --secret-file .secrets --reuse

The CI run on this PR will verify that the test also runs successfully in the actual CI environment.

- Fix pyproject.toml: Add [tool.hatch.build.targets.wheel] packages specification
- Enable terminal-bench test in CI workflow with proper conditions
- Add PATH fixes for uv and terminal-bench binaries
- Add timeouts to prevent hanging (20min job, 15min step)
- Add .actrc for local GitHub Actions testing with act
- Add .secrets to .gitignore

The terminal-bench adapter package was failing to build because hatchling
didn't know which files to include. Now it correctly includes the adapters/
directory and the test can run in CI.
@edgarpavlovsky deleted the fix-terminal-bench-ci branch November 8, 2025 02:12