MiniLab

MiniLab is a multi-agent scientific research assistant that combines autonomous analysis with collaborative agent workflows. It draws on CellVoyager (autonomous biological data analysis), VirtualLab (multi-agent scientific collaboration), and modern agentic coding paradigms to provide an integrated environment for running research workflows from literature review through analysis and write-up.

Overview

MiniLab creates a team of specialized AI agents that work together to assist researchers with:

Literature Synthesis: Comprehensive searches across PubMed and arXiv with critical assessment
Data Exploration: Autonomous exploration and characterization of datasets
Hypothesis Generation: Multi-agent deliberation to develop and refine research questions
Analysis Execution: End-to-end implementation from planning through statistical validation
Documentation: Automated report generation with proper citations and figure legends

The system employs a ReAct-style execution loop in which agents autonomously use tools, consult colleagues, and iterate toward solutions, while human oversight is maintained at key decision points.
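
As a rough illustration of the reason/act/observe cycle, here is a minimal sketch. It is not MiniLab's `Agent` implementation: `fake_llm`, the `TOOLS` table, and the step dictionary format are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> callable taking one string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}

def fake_llm(history: list[str]) -> dict:
    """Stand-in for a model call: request one tool use, then finish."""
    if not any(line.startswith("Observation:") for line in history):
        return {
            "thought": "I should count the words.",
            "action": "word_count",
            "input": "cfDNA methylation analysis",
        }
    # Use the most recent observation as the final answer.
    return {"thought": "I have what I need.",
            "final": history[-1].split(": ", 1)[1]}

def react_loop(task: str, max_iterations: int = 5) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_iterations):
        step = fake_llm(history)                            # reason
        history.append(f"Thought: {step['thought']}")
        if "final" in step:
            return step["final"]
        observation = TOOLS[step["action"]](step["input"])  # act
        history.append(f"Observation: {observation}")       # observe
    return "stopped: iteration limit reached"

answer = react_loop("How many words are in the phrase?")
print(answer)
```

The iteration cap is what keeps such a loop bounded; a real agent would replace `fake_llm` with a backend call and could pause for user input at chosen steps.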

Key Features

Multi-Agent Architecture

Nine specialized agents organized into three guilds (Synthesis, Theory, Implementation)
Cross-agent consultation with visible dialogue for transparency
Dynamic delegation based on task requirements and agent expertise

Intelligent Budget Management

Bayesian learning from historical token usage to improve allocation accuracy
Continuous complexity estimation (0.0-1.0 scale) replacing coarse tiers
5% graceful shutdown reserve ensuring clean completion within budget
Budget-scaled iterations adjusting agent loop limits based on remaining budget
Self-critique checkpoints verifying output quality before committing

Autonomous Execution

ReAct-style loops enabling agents to reason, act, and observe iteratively
LLM response caching (SQLite-backed) reducing redundant API calls
Checkpoint/resume capability for long-running analyses
Session orchestration managing full lifecycle from startup to teardown

Flexible Workflow System

Six composable mini-workflows that can be combined into larger pipelines
Dynamic token allocation based on workflow type and estimated complexity
Adaptive modes responding to project requirements

Security and Safety

PathGuard access control enforcing read-only data directories and sandboxed outputs
Agent-specific permissions limiting tool access by role (defined in team.yaml)
Audit logging for all file operations

User Experience

Narrative-style communication from the orchestrator (Bohr)
Visible agent consultations showing inter-agent dialogue
Graceful interruption with progress saving via Ctrl+C
Comprehensive transcripts capturing all session activity

Architecture

MiniLab/
├── agents/                    # Agent system
│   ├── base.py               # Agent with ReAct loop, self-critique, budget-scaled iterations
│   ├── prompts.py            # Structured 5-part prompting schema
│   └── registry.py           # Agent creation and colleague relationships
├── config/                    # Configuration (separated by concern)
│   ├── agents.yaml           # Agent personas, communication style, operating principles
│   ├── team.yaml             # Security: tools, write permissions, shell access
│   ├── budgets.yaml          # Token budgets per workflow type
│   ├── loader.py             # YAML configuration loader for agents.yaml
│   ├── team_loader.py        # YAML configuration loader for team.yaml
│   ├── budget_manager.py     # Dynamic budget allocation with continuous complexity
│   └── budget_history.py     # Bayesian learning from historical token usage
├── context/                   # RAG-based context management
│   ├── context_manager.py    # Context orchestration with token budgets
│   ├── embeddings.py         # Sentence-transformers integration
│   ├── vector_store.py       # FAISS vector store for retrieval
│   └── state_objects.py      # ProjectState, TaskState definitions
├── core/                      # Core infrastructure
│   ├── token_account.py      # Centralized token tracking with threshold warnings
│   ├── token_context.py      # Token context for budget awareness
│   └── project_writer.py     # Centralized output management
├── llm_backends/             # LLM integrations
│   ├── anthropic_backend.py  # Claude API with prompt caching and cache integration
│   ├── openai_backend.py     # OpenAI API support
│   ├── cache.py              # SQLite-backed LLM response caching
│   └── base.py               # Abstract backend interface
├── orchestrators/
│   ├── bohr_orchestrator.py  # Workflow coordination, session management
│   └── session_orchestrator.py # Session lifecycle management
├── security/
│   └── path_guard.py         # File access control and audit logging
├── storage/
│   ├── state_store.py        # Persistent state management
│   └── transcript.py         # Session transcript logging
├── tools/                    # Typed tool system
│   ├── base.py               # Tool, ToolInput, ToolOutput base classes
│   ├── filesystem.py         # File read/write/list operations
│   ├── code_editor.py        # Code creation and editing
│   ├── terminal.py           # Shell command execution
│   ├── environment.py        # Package management
│   ├── web_search.py         # Tavily web search integration
│   ├── pubmed.py             # NCBI E-utilities for literature
│   ├── arxiv.py              # arXiv paper search
│   ├── citation.py           # Bibliography management
│   ├── user_input.py         # User interaction tool
│   └── tool_factory.py       # Agent-specific tool instantiation
├── utils/
│   ├── __init__.py           # Console formatting, spinners
│   └── timing.py             # Performance timing utilities
└── workflows/                # Modular workflow components
    ├── base.py               # WorkflowModule abstract base class
    ├── consultation.py       # User goal clarification (Bohr)
    ├── literature_review.py  # Background research (Gould)
    ├── planning_committee.py # Multi-agent deliberation
    ├── execute_analysis.py   # Implementation loop (Dayhoff→Hinton→Bayes)
    ├── writeup_results.py    # Documentation (Gould)
    └── critical_review.py    # Quality assessment (Farber)

Agent Team

All agents use Claude Sonnet 4 via the Anthropic API with structured role-specific prompting:

Agent	Guild	Role	Specialty
Bohr	Synthesis	Project Manager	Orchestration, user interaction, workflow selection
Gould	Synthesis	Librarian Writer	Literature review, citations, scientific writing
Farber	Synthesis	Clinician Critic	Critical review, clinical relevance, quality control
Feynman	Theory	Curious Physicist	Creative problem-solving, analogies, naive questions
Shannon	Theory	Information Theorist	Experimental design, methodology, analytical rigor
Greider	Theory	Molecular Biologist	Biological mechanisms, pathway interpretation
Dayhoff	Implementation	Bioinformatician	Workflow design, data pipelines, execution planning
Hinton	Implementation	CS Engineer	Code development, debugging, script execution
Bayes	Implementation	Statistician	Statistical validation, uncertainty quantification

Installation

Prerequisites

macOS or Linux
Python 3.11 or higher
micromamba, conda, or mamba for environment management
Anthropic API key (required)
Tavily API key (optional, for web search)

Setup

# Clone repository
git clone https://github.com/denniepatton/MiniLab.git
cd MiniLab

# Create environment
micromamba env create -f environment.yml
micromamba activate minilab

# Install in development mode
pip install -e .

# Configure environment variables
cp example.env .env
# Edit .env with your API keys

# Verify installation
python -c "from MiniLab import run_minilab; print('MiniLab ready')"

Environment Variables

# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional - Web Search
TAVILY_API_KEY=tvly-...

# Optional - PubMed (higher rate limits)
NCBI_EMAIL=your@email.com
NCBI_API_KEY=...

# Optional - Timing/Debug
MINILAB_TIMING=1  # Enable timing reports
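
A small preflight check can catch a missing key before a session starts. The sketch below only uses the variable names listed above; the `check_environment` helper itself is hypothetical and not part of MiniLab.

```python
import os
from typing import Mapping, Optional

REQUIRED = ("ANTHROPIC_API_KEY",)
OPTIONAL = ("TAVILY_API_KEY", "NCBI_EMAIL", "NCBI_API_KEY", "MINILAB_TIMING")

def check_environment(env: Optional[Mapping[str, str]] = None) -> dict:
    """Report which documented variables are unset or empty."""
    env = os.environ if env is None else env
    return {
        "missing_required": [name for name in REQUIRED if not env.get(name)],
        "missing_optional": [name for name in OPTIONAL if not env.get(name)],
    }

report = check_environment()
if report["missing_required"]:
    print("Missing required variables:", ", ".join(report["missing_required"]))
else:
    print("Required variables are set")
```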

Usage

Command Line Interface

# Start a new analysis project
python scripts/minilab.py "Analyze the Pluvicto genomic data for treatment response predictors"

# Quick literature review
python scripts/minilab.py "What is the state of the art in cfDNA methylation analysis?"

# Resume an existing project
python scripts/minilab.py --resume Sandbox/pluvicto_analysis

# List existing projects
python scripts/minilab.py --list-projects

# Enable performance timing
python scripts/minilab.py --timing

Python API

import asyncio
from MiniLab import run_minilab

async def main():
    results = await run_minilab(
        request="Analyze genomic features predictive of Pluvicto response",
        project_name="pluvicto_analysis",
    )
    print(results["final_summary"])

asyncio.run(main())

Interactive Session

During execution, you can interrupt with Ctrl+C to access options:

Provide guidance - Give direction to the current workflow
Skip to next phase - Move past the current workflow step
Save and exit - Preserve progress for later resumption
Continue - Cancel the interrupt and proceed

Budget Management System

MiniLab v0.4.0 introduces an intelligent budget management system that learns from historical usage:

Bayesian Budget Learning

The system maintains a history of token usage across workflows and uses Bayesian estimation to improve future allocations:

# Budget history tracks actual vs. allocated tokens per workflow
# Over time, allocations converge to realistic requirements
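
The README does not specify which Bayesian model is used. As one possible reading, the sketch below applies a normal-normal conjugate update to the mean token usage of a single workflow type. The `UsagePosterior` class, the prior, the noise variance, and the run history are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class UsagePosterior:
    """Normal belief over the mean token usage of one workflow type."""
    mean: float       # current estimate of mean tokens used
    variance: float   # uncertainty about that mean

    def update(self, observed_tokens: float, noise_variance: float) -> "UsagePosterior":
        # Normal-normal conjugate update with known observation noise:
        # precisions add, and the new mean is a precision-weighted average.
        precision = 1.0 / self.variance + 1.0 / noise_variance
        new_variance = 1.0 / precision
        new_mean = new_variance * (
            self.mean / self.variance + observed_tokens / noise_variance
        )
        return UsagePosterior(new_mean, new_variance)

# Prior: a loose guess of 100k tokens.
belief = UsagePosterior(mean=100_000.0, variance=40_000.0 ** 2)

# Made-up history of actual usage from three earlier runs.
for actual in (150_000.0, 140_000.0, 160_000.0):
    belief = belief.update(actual, noise_variance=20_000.0 ** 2)

print(round(belief.mean), round(belief.variance ** 0.5))
```

Each observation pulls the estimate toward actual usage and shrinks its variance, which is the sense in which allocations would converge over time.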

Continuous Complexity Estimation

Instead of coarse "Quick/Thorough/Comprehensive" tiers, complexity is now estimated as a continuous value (0.0-1.0):

Complexity	Description	Typical Use Case
0.0-0.3	Simple	Quick questions, brainstorming
0.3-0.6	Moderate	Standard analyses, literature reviews
0.6-0.8	Complex	Multi-modal data, extensive pipelines
0.8-1.0	Very Complex	Deep research, comprehensive analyses
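
The bands in the table share their endpoints (0.3, 0.6, 0.8), so any mapping has to pick a side. The sketch below assigns each boundary to the higher band and scales a budget linearly with the score. Both choices are assumptions, as are the function names and the token figures.

```python
def complexity_band(score: float) -> str:
    """Map a continuous complexity score to the descriptive band in the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"complexity must be within [0.0, 1.0], got {score}")
    if score < 0.3:
        return "Simple"
    if score < 0.6:
        return "Moderate"
    if score < 0.8:
        return "Complex"
    return "Very Complex"

def scaled_budget(score: float, min_tokens: int, max_tokens: int) -> int:
    """Linearly interpolate a token budget between two bounds (assumed rule)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"complexity must be within [0.0, 1.0], got {score}")
    return round(min_tokens + score * (max_tokens - min_tokens))

print(complexity_band(0.45), scaled_budget(0.5, 50_000, 250_000))
```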

Budget Safeguards

5% graceful shutdown reserve: Always retains budget for clean completion
Budget-scaled iterations: Agent loop limits adjust based on remaining budget
Self-critique checkpoints: Agents verify output quality before committing
Hard enforcement: BudgetExceededError prevents runaway token usage
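
These safeguards could interact roughly as in the sketch below. Only the 5% reserve and the `BudgetExceededError` name come from this README; the `TokenBudget` class, the base limit of 20 iterations, the floor of 2, and the proportional scaling rule are assumptions.

```python
class BudgetExceededError(RuntimeError):
    """Raised when a charge would cut into the shutdown reserve (behaviour assumed)."""

class TokenBudget:
    RESERVE_PERCENT = 5  # held back for graceful shutdown

    def __init__(self, total: int) -> None:
        self.total = total
        self.used = 0

    @property
    def usable(self) -> int:
        # Integer arithmetic avoids floating-point rounding surprises.
        return self.total - self.total * self.RESERVE_PERCENT // 100

    @property
    def remaining(self) -> int:
        return self.usable - self.used

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.usable:
            raise BudgetExceededError(
                f"charge of {tokens} exceeds remaining {self.remaining} tokens"
            )
        self.used += tokens

    def max_iterations(self, base_limit: int = 20, floor: int = 2) -> int:
        """Shrink the agent loop limit in proportion to the budget left."""
        if self.usable <= 0:
            return floor
        fraction_left = self.remaining / self.usable
        return max(floor, int(base_limit * fraction_left))

budget = TokenBudget(100_000)
budget.charge(47_500)
print(budget.remaining, budget.max_iterations())
```

Keeping the reserve outside the chargeable pool means a final write-up step can still run after ordinary work has been cut off.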

LLM Response Caching

SQLite-backed caching reduces redundant API calls:

24-hour TTL for cached responses
Automatic cache invalidation
Transparent integration with LLM backends
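
A cache of this kind can be sketched with the standard library alone. The table schema, the SHA-256 key over model and prompt, and the `ResponseCache` class are assumptions and do not reflect the contents of `cache.py`; only the SQLite backing and the 24-hour TTL come from the description above.

```python
import hashlib
import sqlite3
import time
from typing import Optional

class ResponseCache:
    def __init__(self, path: str = ":memory:", ttl_seconds: int = 24 * 3600) -> None:
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS responses "
            "(key TEXT PRIMARY KEY, response TEXT NOT NULL, created REAL NOT NULL)"
        )

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def put(self, model: str, prompt: str, response: str,
            now: Optional[float] = None) -> None:
        created = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO responses VALUES (?, ?, ?)",
            (self._key(model, prompt), response, created),
        )
        self.db.commit()

    def get(self, model: str, prompt: str,
            now: Optional[float] = None) -> Optional[str]:
        current = time.time() if now is None else now
        key = self._key(model, prompt)
        row = self.db.execute(
            "SELECT response, created FROM responses WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        response, created = row
        if current - created > self.ttl:
            # Expired: drop the entry so the caller falls through to the API.
            self.db.execute("DELETE FROM responses WHERE key = ?", (key,))
            self.db.commit()
            return None
        return response

cache = ResponseCache()
cache.put("some-model", "What is cfDNA?", "Cell-free DNA is ...", now=0.0)
print(cache.get("some-model", "What is cfDNA?", now=3600.0))
```

The optional `now` argument exists only so that expiry can be exercised without waiting a day.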

Workflows

Major Workflows

Workflow	Description	Complexity Guidance
`brainstorming`	Explore ideas and hypotheses	Low (0.2-0.4)
`literature_review`	Background research and synthesis	Moderate (0.4-0.6)
`start_project`	Full analysis pipeline	High (0.6-0.8)
`explore_dataset`	Data characterization and EDA	Moderate (0.4-0.6)

Mini-Workflow Modules

Consultation - User discussion, goal clarification, complexity estimation
Literature Review - PubMed/arXiv search with critical assessment
Planning Committee - Multi-agent deliberation on methodology
Execute Analysis - Dayhoff→Hinton→Bayes implementation loop
Write-up Results - Documentation and report generation
Critical Review - Quality assessment and recommendations
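
To show what composing modules into a pipeline might look like, here is a minimal sketch. The `Step` base class, the shared state dictionary, and `run_pipeline` are hypothetical and stand in for `WorkflowModule` and the orchestrator, whose real interfaces are not described here. Which modules each major workflow actually chains is also not stated, so the two-step pipeline is only an example.

```python
from abc import ABC, abstractmethod

class Step(ABC):
    """Hypothetical stand-in for a mini-workflow module."""
    name = "step"

    @abstractmethod
    def run(self, state: dict) -> dict:
        """Return a new state dict with this step's results added."""

class Consultation(Step):
    name = "consultation"

    def run(self, state: dict) -> dict:
        # A real module would talk to the user; here the request is just tidied.
        return {**state, "goal": state["request"].strip()}

class LiteratureReview(Step):
    name = "literature_review"

    def run(self, state: dict) -> dict:
        # A real module would search PubMed/arXiv; here the list stays empty.
        return {**state, "references": []}

def run_pipeline(steps: list, request: str) -> dict:
    state = {"request": request, "completed": []}
    for step in steps:
        state = step.run(state)
        state["completed"] = state["completed"] + [step.name]
    return state

result = run_pipeline([Consultation(), LiteratureReview()], "  cfDNA methylation  ")
print(result["goal"], result["completed"])
```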

Configuration Files

MiniLab uses three main configuration files, each with a distinct purpose:

File	Purpose	Loaded By
`config/agents.yaml`	Agent personas, communication style, operating principles	`loader.py`
`config/team.yaml`	Security: tools, write permissions, shell access	`team_loader.py`
`config/budgets.yaml`	Token budgets per workflow type	`budget_manager.py`

This separation of concerns allows:

Personas to be tuned independently of security
Security policies to be enforced uniformly
Budgets to be adjusted without touching agent definitions

Security Model

MiniLab enforces strict file access control via PathGuard and team.yaml:

Directory	Access	Purpose
`ReadData/`	Read-only	Protected input data
`Sandbox/`	Read-write	Project outputs and intermediate files
Other paths	Blocked	No access outside workspace

Agent-Specific Permissions (from `team.yaml`)

Agent	Shell Access	Writable Extensions
Hinton	✓	All
Bayes	✓	`.py`, `.r`, `.R`, `.md`, `.txt`, `.json`, `.csv`
Gould	✗	`.md`, `.txt`, `.bib`, `.json`, `.yaml`, `.yml`
Others	✗	`.md`, `.txt`, `.json`
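
The table can be read as a lookup from agent name to permitted file extensions. The sketch below transcribes it directly. The dictionary layout and the `can_write` helper are assumptions and do not mirror the actual structure of `team.yaml`.

```python
from pathlib import PurePosixPath

# Transcribed from the table above; None means all extensions are writable.
WRITABLE_EXTENSIONS = {
    "Hinton": None,
    "Bayes": {".py", ".r", ".R", ".md", ".txt", ".json", ".csv"},
    "Gould": {".md", ".txt", ".bib", ".json", ".yaml", ".yml"},
}
DEFAULT_EXTENSIONS = {".md", ".txt", ".json"}  # the "Others" row
SHELL_ACCESS = {"Hinton", "Bayes"}

def can_write(agent: str, filename: str) -> bool:
    """Check a filename's extension against the agent's writable set."""
    allowed = WRITABLE_EXTENSIONS.get(agent, DEFAULT_EXTENSIONS)
    if allowed is None:
        return True
    return PurePosixPath(filename).suffix in allowed

def can_use_shell(agent: str) -> bool:
    return agent in SHELL_ACCESS

print(can_write("Gould", "refs.bib"), can_write("Gould", "script.py"))
```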

Additional protections:

Path traversal attacks are blocked
Comprehensive audit logging
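
The directory policy and the traversal check can be illustrated with `pathlib`. This is a sketch of the general technique, not the code in `path_guard.py`: the `SimplePathGuard` class, the `AccessDenied` exception, and the in-memory audit list are assumptions, and it relies on `Path.is_relative_to`, available from Python 3.9.

```python
from pathlib import Path

class AccessDenied(PermissionError):
    """Raised when a path falls outside what the policy allows."""

class SimplePathGuard:
    def __init__(self, workspace: str) -> None:
        self.workspace = Path(workspace).resolve()
        self.read_only = self.workspace / "ReadData"
        self.read_write = self.workspace / "Sandbox"
        self.audit_log = []  # entries of (mode, resolved path, allowed)

    def check(self, path: str, mode: str) -> Path:
        """Validate a 'read' or 'write' request and return the resolved path."""
        if mode not in ("read", "write"):
            raise ValueError(f"unknown mode: {mode}")
        # Resolving first collapses any '..' segments, so traversal attempts
        # are judged by where they actually point.
        target = (self.workspace / path).resolve()
        in_read_only = target.is_relative_to(self.read_only)
        in_read_write = target.is_relative_to(self.read_write)
        allowed = in_read_write or (in_read_only and mode == "read")
        self.audit_log.append((mode, str(target), allowed))
        if not allowed:
            raise AccessDenied(f"{mode} access to {target} is blocked")
        return target

guard = SimplePathGuard("/tmp/minilab_workspace")
print(guard.check("Sandbox/project/results.csv", "write").name)
```

Logging the decision before raising means denied attempts appear in the audit trail as well as successful ones.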

Project Output Structure

All outputs are organized within Sandbox/{project_name}/:

{project_name}/
├── project_specification.md    # Goals and scope from consultation
├── data_manifest.md           # Summary of input data
├── session.json               # Session state for resumption
├── literature/
│   ├── references.md          # Bibliography
│   └── literature_summary.md  # Narrative synthesis
├── planning/
│   ├── analysis_plan.md       # Detailed analysis plan
│   └── decision_rationale.md  # Planning decisions
├── analysis/
│   ├── exploratory/          # EDA scripts and outputs
│   └── modeling/             # Statistical models
├── figures/                   # Generated visualizations
├── outputs/
│   ├── summary_report.md     # Final findings
│   └── tables/               # Result tables
├── checkpoints/              # Workflow state for resumption
└── logs/                     # Execution logs

Development

Running Tests

# Run all tests
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ --cov=MiniLab --cov-report=html

Import Verification

from MiniLab import (
    run_minilab,
    BohrOrchestrator,
    PathGuard,
    Agent,
    WorkflowModule,
    console,
)
print("All imports successful")

Best Practices

Trust the agents - Allow the ReAct loop to iterate; avoid micromanaging
Prepare your data - Ensure data files exist in ReadData/ before starting
Use descriptive project names - Facilitates organization and resumption
Start with exploration - Use brainstorming or literature_review to understand scope
Review transcripts - Stored in Transcripts/ for debugging and auditing
Let budgets adapt - The Bayesian system improves with use; trust its estimates

Limitations

Agents may hallucinate when their claims are not grounded in tool output
Long-running computations may require timeout adjustments
API costs accumulate with complex, multi-phase analyses
Requires active API keys for full functionality
Currently optimized for biomedical and computational biology research

Data Security Notice

MiniLab sends data to external APIs (Anthropic, Tavily, NCBI). Users should not process protected health information (PHI) without:

Institutional Review Board (IRB) approval
Business Associate Agreement (BAA) with API providers
Appropriate de-identification procedures

License

MIT License - see LICENSE for details.

Acknowledgments

MiniLab is inspired by and builds upon ideas from:

CellVoyager - Autonomous biological data analysis
VirtualLab - Multi-agent scientific collaboration
Modern agentic coding assistants and ReAct-style agent architectures

Changelog

Version 0.4.0 (December 2025)

Bayesian Budget Learning: Historical token usage now informs future allocations via BudgetHistory
Continuous Complexity Estimation: Replaced coarse Quick/Thorough/Comprehensive tiers with 0.0-1.0 scale
LLM Response Caching: SQLite-backed cache with 24h TTL reduces redundant API calls
Session Orchestrator: New SessionOrchestrator manages full session lifecycle
Self-Critique Checkpoints: Agents verify output quality before committing (CellVoyager pattern)
Budget-Scaled Iterations: Agent loop limits adapt based on remaining budget (VS Code pattern)
5% Graceful Shutdown Reserve: Ensures clean completion within budget
Codebase Cleanup: Removed unused runtime/ and evaluation/ modules

Version 0.3.3 (December 2025)

Minor bug fixes and stability improvements

Version 0.3.2 (December 2025)

Intelligent Budget Allocation: Bohr reserves 10% buffer for graceful completion, never exceeds budget
Contextual Autonomy: Natural language user preferences flow through to agent tools (no hardcoded levels)
Budget Typo Handling: Fixes common typos like "200l" → "200k", warns on ambiguous input
Hard Budget Enforcement: BudgetExceededError exception and agent ReAct loop budget checks
User Preference Propagation: Consultation captures "best judgment"/"without consulting" preferences
Auto-proceed in Autonomous Mode: user_input tool respects user's autonomy preferences
Graceful Completion: Always finishes cleanly, skips to writeup when budget is low

Version 0.3.1 (December 2025)

TokenAccount: Real-time token budget tracking with warnings at 60/80/95% thresholds
ProjectWriter: Centralized output management preventing duplicate files
Complete Transcript System: Full lab notebook capturing all agent conversations, reasoning, and tool use
Date Injection: Current session date injected into all agent prompts (fixes date hallucination)
Conditional data_manifest.md: Only created when data files are present
Single session_summary.md: Prevented duplicate file creation by agents
Output Guidelines: Agents instructed not to create redundant files (executive_summary.md, etc.)
Budget warnings displayed to agents as they approach token limits

Version 0.3.0 (December 2025)

Redesigned token budget system with Quick/Thorough/Comprehensive tiers and custom input
Narrative-style orchestrator communication
Visible cross-agent consultations
Tiered literature review (Quick 3-step vs. Comprehensive 7-step)
Immediate graceful exit with agent interruption propagation
Consolidated output file structure (single living documents)
Enhanced transcript system as single source of truth
Agent signature guidelines ("MiniLab Agent [Name]")
Timestamp utilities to prevent date hallucination
Post-consultation summary showing confirmed scope and budget

Version 0.2.0 (December 2025)

Complete architecture refactor
PathGuard security system
Structured 5-part agent prompting
RAG context management with FAISS
Modular workflow system
Tavily web search integration
PubMed and arXiv literature tools
Bohr orchestrator for workflow coordination
Console utilities for styled output
Prompt caching for cost reduction
