[CRITICAL] Add integration tests for monitoring system #12

@emiperez95

Description

Testing Gap

No integration tests cover the complete monitoring pipeline, so system-level failures can reach production undetected.

Components Needing Integration Testing

1. End-to-End Tool Tracking

  • Agent execution → Hook triggers → Database writes → Metrics export → Prometheus scraping
  • Verify complete data flow accuracy

2. Database Consistency

  • Foreign key constraints between tables
  • Transaction isolation
  • Concurrent access handling
  • Data integrity under load

3. Prometheus Integration

  • Metric format validation
  • HTTP endpoint functionality
  • Registry management
  • Scraping reliability
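Metric format validation needs no extra dependencies: the Prometheus exposition-format documentation constrains metric names to `[a-zA-Z_:][a-zA-Z0-9_:]*` and label names to `[a-zA-Z_][a-zA-Z0-9_]*`. A minimal sketch of the `is_valid_metric_name` / `is_valid_label` helpers the tests below rely on:

```python
import re

# Naming rules from the Prometheus exposition-format documentation.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

def is_valid_metric_name(name: str) -> bool:
    return METRIC_NAME_RE.fullmatch(name) is not None

def is_valid_label(name: str) -> bool:
    # Label *values* may be any UTF-8; only label names are constrained.
    return LABEL_NAME_RE.fullmatch(name) is not None
```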

Test Architecture

Integration Test Structure

tests/
├── integration/
│   ├── test_end_to_end_tracking.py       # Full workflow
│   ├── test_database_integration.py      # Multi-table operations
│   ├── test_prometheus_integration.py    # Metrics export
│   ├── test_concurrent_access.py         # Parallel execution
│   └── test_performance_integration.py   # Load testing
├── docker/
│   ├── test-prometheus.yml              # Test Prometheus
│   └── test-stack.yml                   # Full stack
└── fixtures/
    └── integration_scenarios.py         # Test scenarios

Test Scenarios

Scenario 1: Complete Agent Workflow

import requests

def test_complete_agent_workflow():
    # 1. Start prometheus exporter
    exporter = start_test_exporter()
    
    # 2. Simulate agent execution with tool usage
    mock_agent_execution([
        ('Read', {'file_path': 'test.py'}),
        ('Edit', {'content': 'new content'}),
        ('Bash', {'command': 'pytest'})
    ])
    
    # 3. Verify database entries
    assert_database_entries([
        'agent_invocations',
        'agent_tool_uses', 
        'sessions'
    ])
    
    # 4. Verify metrics endpoint
    response = requests.get('http://localhost:9091/metrics')
    assert 'agent_tool_usage_total' in response.text
    
    # 5. Verify metric accuracy
    assert_metric_value('agent_tool_usage_total{tool_name="Read"}', 1)
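`assert_metric_value` above is a hypothetical helper; one way to implement it is to parse the plain-text exposition payload directly. This sketch takes the scraped text as an explicit argument (a slightly different signature than the call above):

```python
def get_metric_value(payload: str, metric_with_labels: str) -> float:
    # Each sample in the text exposition format is "<name>{labels} <value>".
    # A prefix match is good enough for a sketch; a real helper should
    # match the full name + label set.
    for line in payload.splitlines():
        if line.startswith(metric_with_labels):
            return float(line.rsplit(" ", 1)[1])
    raise AssertionError(f"{metric_with_labels} not found in payload")

def assert_metric_value(payload: str, metric_with_labels: str, expected: float):
    assert get_metric_value(payload, metric_with_labels) == expected
```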

Scenario 2: Concurrent Agent Execution

def test_concurrent_agent_execution():
    # Simulate 5 agents running in parallel
    agents = []
    for i in range(5):
        agent = MockAgent(f'agent-{i}')
        agents.append(agent)
        agent.start()
    
    # Wait for completion
    for agent in agents:
        agent.join()
    
    # Verify data integrity
    assert_no_race_conditions()
    assert_session_consistency()
    assert_metric_accuracy()
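`MockAgent` can be as simple as a thread that pushes tool-use events through the same write path a real agent would. A self-contained sketch, with a queue standing in for the database (all names illustrative):

```python
import queue
import threading

events = queue.Queue()  # stand-in for the monitoring database

class MockAgent(threading.Thread):
    """Hypothetical agent: emits a fixed number of tool-use events."""

    def __init__(self, name, n_tools=3):
        super().__init__(name=name)
        self.n_tools = n_tools

    def run(self):
        for i in range(self.n_tools):
            events.put((self.name, f"tool-{i}"))  # one write per tool use

agents = [MockAgent(f"agent-{i}") for i in range(5)]
for agent in agents:
    agent.start()
for agent in agents:
    agent.join()
```

With five agents and three tools each, exactly fifteen events should arrive with none lost or duplicated; the real assertions would check the same invariant against the database.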

Scenario 3: Database Load Test

import time

def test_database_load():
    # Generate 10K invocations with 100K tool uses
    generate_load_data(
        sessions=100,
        invocations_per_session=100,
        tools_per_invocation=10
    )
    
    # Start collector
    collector = MetricsCollector(test_db)
    
    # Measure performance
    start = time.time()
    collector.collect_metrics()
    duration = time.time() - start
    
    # Performance assertions
    assert duration < 1.0  # Under 1 second
    assert memory_usage() < 100 * 1024 * 1024  # Under 100MB
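`memory_usage()` above is hypothetical; one stdlib way to enforce the memory budget is `tracemalloc`, which reports peak Python-level allocations for the measured span (the workload here is a stand-in, not the real collector):

```python
import tracemalloc

tracemalloc.start()
# Stand-in for collector.collect_metrics() running over loaded data.
rows = [{"tool_name": "Read", "invocation": i} for i in range(10_000)]
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Same 100 MB budget as the scenario above.
assert peak < 100 * 1024 * 1024
```

Note that `tracemalloc` only sees allocations made by the Python allocator; memory held by C extensions (e.g. SQLite page caches) needs an RSS-based check instead.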

Docker Test Environment

# docker/test-stack.yml
version: '3.8'
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./test-prometheus.yml:/etc/prometheus/prometheus.yml
  
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=test
  
  agent-exporter:
    build: .
    ports:
      - "9091:9091"
    volumes:
      - ./test_data:/app/logs
    command: python tools/prometheus_exporter.py --port 9091

Test Implementation

Database Integration Tests

import sqlite3

import pytest

class TestDatabaseIntegration:
    def test_foreign_key_constraints(self):
        # Insert agent_tool_uses without corresponding agent_invocation
        # Should fail with foreign key constraint
        with pytest.raises(sqlite3.IntegrityError):
            insert_tool_use_without_invocation()
    
    def test_transaction_rollback(self):
        # Start transaction, make changes, rollback
        # Verify no partial data persists
        pass
    
    def test_concurrent_writes(self):
        # Multiple threads writing simultaneously
        # Verify no deadlocks or data corruption
        pass
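The first two stubs above can be made concrete with an in-memory database. Two SQLite details matter: foreign-key enforcement is off by default and must be enabled per connection, and with the default `sqlite3` isolation level an `INSERT` opens an implicit transaction that `rollback()` discards. A sketch (table definitions are illustrative, not the project's full schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("CREATE TABLE agent_invocations (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE agent_tool_uses ("
    "  id INTEGER PRIMARY KEY,"
    "  invocation_id INTEGER NOT NULL REFERENCES agent_invocations(id))"
)

# Foreign keys: an orphaned tool-use row must be rejected.
fk_rejected = False
try:
    conn.execute("INSERT INTO agent_tool_uses (invocation_id) VALUES (999)")
except sqlite3.IntegrityError:
    fk_rejected = True

# Rollback: the uncommitted tool-use row must leave no partial data behind.
conn.execute("INSERT INTO agent_invocations (id) VALUES (1)")
conn.commit()
conn.execute("INSERT INTO agent_tool_uses (invocation_id) VALUES (1)")
conn.rollback()
tool_rows = conn.execute("SELECT COUNT(*) FROM agent_tool_uses").fetchone()[0]
```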

Prometheus Integration Tests

import requests

class TestPrometheusIntegration:
    def test_metrics_format_validation(self):
        # Scrape metrics endpoint
        response = requests.get('http://localhost:9091/metrics')
        
        # Parse with prometheus parser
        metrics = prometheus_parser.parse(response.text)
        
        # Validate metric names and labels
        for metric in metrics:
            assert is_valid_metric_name(metric.name)
            assert all(is_valid_label(label) for label in metric.labels)
    
    def test_metric_cardinality(self):
        # Generate high-cardinality scenario
        # Verify metrics don't exceed limits
        assert len(get_unique_label_combinations()) < 10000
    
    def test_prometheus_scraping(self):
        # Start prometheus with test config
        # Verify successful scraping
        # Check prometheus targets are healthy
        pass

Performance Integration Tests

def test_performance_under_load():
    # Test with realistic production data volumes
    scenarios = [
        (1000, 10000),     # 1K sessions, 10K invocations
        (5000, 50000),     # 5K sessions, 50K invocations
        (10000, 100000),   # 10K sessions, 100K invocations
    ]
    
    for sessions, invocations in scenarios:
        setup_test_data(sessions, invocations)
        
        # Measure collection performance
        collector = MetricsCollector(test_db)
        duration = time_collection(collector)
        
        # Assert performance requirements
        assert duration < (invocations / 10000)  # Linear scaling
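`time_collection` is a hypothetical helper; `time.perf_counter()` is the right clock for it (monotonic and high resolution, unlike `time.time()`). A sketch with a stub collector so it runs standalone:

```python
import time

def time_collection(collector) -> float:
    """Wall-clock a single collect_metrics() call."""
    start = time.perf_counter()  # monotonic, unaffected by clock changes
    collector.collect_metrics()
    return time.perf_counter() - start

class _StubCollector:
    """Stand-in for MetricsCollector: pretends to do 10 ms of work."""

    def collect_metrics(self):
        time.sleep(0.01)

duration = time_collection(_StubCollector())
```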

CI/CD Integration

# .github/workflows/integration-test.yml
name: Integration Tests
on: [push, pull_request]

jobs:
  integration:
    runs-on: ubuntu-latest
    services:
      prometheus:
        image: prom/prometheus
        ports:
          - 9090:9090
    
    steps:
      - uses: actions/checkout@v3
      - name: Setup test environment
        run: docker-compose -f docker/test-stack.yml up -d
      
      - name: Wait for services
        run: |
          wget --retry-connrefused --waitretry=5 --timeout=30 \
               http://localhost:9090/api/v1/targets
      
      - name: Run integration tests
        run: |
          pytest tests/integration/ -v \
                 --tb=short \
                 --durations=10

Test Data Management

import pytest

@pytest.fixture(scope='session')
def integration_database():
    # Create temporary database with realistic data
    db_path = create_test_database()
    populate_with_realistic_data(db_path)
    yield db_path
    cleanup_test_database(db_path)

def populate_with_realistic_data(db_path):
    # Insert representative data that matches production patterns
    # - Various agent types and execution patterns
    # - Realistic tool usage distributions
    # - Time-based data spanning multiple days
    pass
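One way `populate_with_realistic_data` could look: a skewed tool distribution via weighted sampling and timestamps spread over several days. The schema, weights, and the connection-based signature are illustrative for this sketch:

```python
import datetime
import random
import sqlite3

def populate_with_realistic_data(conn, days=3, invocations_per_day=50):
    # A seeded RNG makes fixture data deterministic, so failures reproduce.
    rng = random.Random(0)
    tools = ["Read", "Edit", "Bash", "Grep"]
    weights = [5, 3, 2, 1]  # skewed distribution, like real tool usage

    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_tool_uses ("
        "  id INTEGER PRIMARY KEY, tool_name TEXT, used_at TEXT)"
    )
    start = datetime.datetime(2024, 1, 1)
    for day in range(days):
        for _ in range(invocations_per_day):
            ts = start + datetime.timedelta(days=day, minutes=rng.randint(0, 1439))
            conn.execute(
                "INSERT INTO agent_tool_uses (tool_name, used_at) VALUES (?, ?)",
                (rng.choices(tools, weights=weights)[0], ts.isoformat()),
            )
    conn.commit()

conn = sqlite3.connect(":memory:")
populate_with_realistic_data(conn)
```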

Validation Criteria

  • End-to-end workflow tests passing
  • Database consistency verified under load
  • Prometheus integration working correctly
  • Concurrent access handling validated
  • Performance requirements met
  • Error scenarios handled gracefully
  • Docker test environment stable

Success Metrics

  • Test Coverage: End-to-end scenarios covered
  • Performance: 100K records processed in < 1 second
  • Reliability: 0 failures in 1000 test runs
  • Concurrency: No race conditions with 10 parallel agents

Effort Estimate

6 hours total

  • 2 hours: End-to-end workflow tests
  • 2 hours: Database integration tests
  • 1 hour: Prometheus integration tests
  • 1 hour: Docker environment and CI/CD

Metadata

Labels

  • bug: Something isn't working
  • integration: Integration testing
  • technical-debt: Code quality issues
