
Conversation


@HardMax71 HardMax71 commented Dec 27, 2025

Use docker compose --wait instead of curl polling, plus COVERAGE_CORE=sysmon for coverage collection.

Summary by CodeRabbit

  • Chores

    • CI reorganized into parallel unit, non‑k8s integration, k8s integration and E2E workflows with image caching and separate coverage streams.
    • New CI-optimized compose config and startup sequencing to ensure test readiness and seed completion.
    • Enhanced failure diagnostics by collecting compose, service, and Kubernetes logs/artifacts.
  • Tests

    • Unit tests isolated from external services; integration tests include automatic DB/Redis cleanup and reclassified test suites.
  • Documentation

    • Updated backend coverage badge link in README.



coderabbitai bot commented Dec 27, 2025

📝 Walkthrough

Walkthrough

Split CI into unit, non‑k8s integration and k8s/e2e flows with pinned images, parallel image cache/restore/save and per-run Codecov flags; add a checked-in docker-compose.ci.yaml; require user-seed completion before cert-generator/backend; migrate Motor → PyMongo async types; move DB/Redis cleanup into integration fixtures and reclassify many tests.
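
For orientation, the following is a minimal sketch of the PyMongo async client usage described above (illustrative database/collection names and pipeline; the repository's actual type aliases and repositories may differ):

import asyncio

from pymongo import AsyncMongoClient
from pymongo.asynchronous.database import AsyncDatabase


async def main() -> None:
    # AsyncMongoClient replaces Motor's AsyncIOMotorClient.
    client: AsyncMongoClient = AsyncMongoClient("mongodb://localhost:27017")
    db: AsyncDatabase = client["example_db"]

    # In the async API, aggregate() is awaited to obtain a cursor, which can
    # then be iterated with "async for" or drained via to_list().
    cursor = await db["events"].aggregate([{"$match": {"event_type": "pod.created"}}])
    events = await cursor.to_list()
    print(len(events))

    # close() is a coroutine here, unlike Motor's synchronous close().
    await client.close()


asyncio.run(main())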

Changes

Cohort / File(s) Summary
CI — Backend workflow
/.github/workflows/backend-ci.yml
Split jobs (unit / integration / e2e); pin images via env vars; parallel image pull + cache/restore/save (zstd); separate pytest runs and Codecov uploads (backend-unit, backend-integration, backend-e2e); expanded failure artifact collection.
CI — Frontend workflow
/.github/workflows/frontend-ci.yml
Remove Bake/composite action; generate kubeconfig via host substitution; use checked-in compose (docker-compose.ci.yaml) to build/start full stack; broaden failure artifacts (service logs, Kafka, kubectl events).
Removed action / Added CI compose
/.github/actions/setup-ci-compose/action.yml (deleted), docker-compose.ci.yaml (added)
Deleted yq-based composite action that generated CI compose; added CI-optimized docker-compose.ci.yaml with infra always-on, app under full profile, healthchecks, sequencing, shared_ca volume and ci-network.
Compose ordering
docker-compose.yaml
cert-generator and backend now depends_on user-seed with condition: service_completed_successfully.
Backend runtime
backend/Dockerfile
Gunicorn/uvicorn flags now use env-default substitutions (WEB_CONCURRENCY, WEB_THREADS, WEB_TIMEOUT, WEB_BACKLOG).
MongoDB client migration
backend/app/core/database_context.py, backend/app/services/*, backend/workers/*, backend/scripts/seed_users.py, backend/tests/**, backend/pyproject.toml
Replace Motor async types/usages with PyMongo asynchronous equivalents (AsyncMongoClient, AsyncDatabase, AsyncCollection, AsyncClientSession); update type aliases/constructors; remove motor dependency.
Async aggregation/iteration fixes
backend/app/db/repositories/*.py, backend/app/events/event_store.py, backend/app/dlq/manager.py
Normalize async aggregation usage: await aggregate() to get cursor, then iterate or call to_list(...); adjust change-stream watch to be awaited where needed.
Tests — fixtures & organisation
backend/tests/conftest.py, backend/tests/integration/conftest.py, backend/tests/unit/conftest.py, backend/tests/unit/.env.unit
Move DB/Redis autouse cleanup into integration conftest; add integration autouse fixture that drops non-system Mongo collections and flushes Redis pre/post test; add unit conftest that loads .env.unit and forbids live DB/Redis/app/client access.
Tests — reclassification & edits
backend/tests/integration/**, backend/tests/unit/**
Many tests reclassified from unit → integration; some unit tests removed/trimmed; AdminUtils/topic tests simplified; mappers tests reduced; schema test uses structured metadata; consolidate k8s stubs via _patch_k8s; adjust Redis SSE timeouts.
Fixtures / real services
backend/tests/fixtures/real_services.py, backend/tests/fixtures/*
Update fixtures/TestDataFactory to use new Database/DBClient aliases and AsyncMongoClient; adjust connect/close and ping semantics.
Docs / Badge
README.md
Backend Codecov badge changed to project root URL and adjusted image query (label "backend cov.").

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant GH as "GitHub Actions"
  participant Cache as "Docker image cache"
  participant Reg as "Image registry"
  participant Docker as "Docker Engine / Compose"
  participant K3s as "k3s (k8s/e2e job)"
  participant Tests as "Pytest runner"
  participant Codecov as "Codecov"
  participant Art as "Artifacts"

  Note over GH,Cache: Restore image cache (key: OS + pinned images)
  GH->>Cache: restore
  alt cache miss
    GH->>Reg: pull pinned images in parallel
    Reg-->>Cache: save compressed images (zstd)
  end

  Note over GH,Docker: Non‑k8s integration flow
  GH->>Docker: docker compose -f docker-compose.ci.yaml --profile infra up -d --wait
  Docker-->>GH: infra ready
  GH->>Docker: docker compose --profile full up -d --build --wait
  GH->>Tests: run pytest (non‑k8s integration)
  Tests-->>Codecov: upload (flag: backend-integration)
  alt tests fail
    Tests-->>Art: upload docker-compose & service logs
  end

  Note over GH,K3s: k8s/e2e flow
  GH->>K3s: setup k3s, apply manifests
  K3s-->>GH: kubeconfig ready
  GH->>Tests: run pytest (k8s/e2e)
  Tests-->>Codecov: upload (flag: backend-e2e)
  alt tests fail
    Tests-->>Art: upload kubectl events & logs
  end

  Note over GH: Unit flow
  GH->>Tests: run pytest (unit)
  Tests-->>Codecov: upload (flag: backend-unit)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped through CI with a twitch and a cheer,

seeds must finish first before certs appear,
images cached snug in zstd beds so deep,
tests march in lanes while logs get tucked to sleep,
a carrot for artifacts — hop, commit, deploy, leap.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.94% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'chore: speed-up of CI tests' is vague and does not reflect the main changes in this PR. The changeset involves a major migration from Motor to PyMongo async client, comprehensive CI workflow restructuring, test reorganization (unit/integration/e2e), and Docker image caching improvements. The title only mentions CI speedup without capturing these substantial technical changes. Consider a more descriptive title such as 'chore: migrate to pymongo async and restructure CI/test workflows' or 'chore: pymongo migration with CI optimization and test reorganization' to accurately represent the scope of changes.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ci-speedup



codecov-commenter commented Dec 27, 2025

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 26.92308% with 38 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...p/db/repositories/admin/admin_events_repository.py 0.00% 8 Missing ⚠️
backend/app/db/repositories/event_repository.py 0.00% 7 Missing ⚠️
backend/app/db/repositories/dlq_repository.py 0.00% 6 Missing ⚠️
...end/app/db/repositories/notification_repository.py 0.00% 4 Missing ⚠️
backend/app/core/database_context.py 80.00% 3 Missing ⚠️
backend/app/db/repositories/saga_repository.py 0.00% 3 Missing ⚠️
backend/app/events/event_store.py 0.00% 3 Missing ⚠️
...ackend/app/db/repositories/execution_repository.py 0.00% 1 Missing ⚠️
backend/app/dlq/manager.py 0.00% 1 Missing ⚠️
backend/app/services/coordinator/coordinator.py 50.00% 1 Missing ⚠️
... and 1 more
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Flag Coverage Δ
backend-e2e 45.06% <26.92%> (?)
backend-unit 53.84% <23.07%> (?)
frontend 56.84% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...ackend/app/db/repositories/execution_repository.py 77.02% <0.00%> (+9.90%) ⬆️
backend/app/dlq/manager.py 78.46% <0.00%> (ø)
backend/app/services/coordinator/coordinator.py 41.79% <50.00%> (ø)
backend/app/services/k8s_worker/worker.py 31.87% <50.00%> (ø)
backend/app/core/database_context.py 77.18% <80.00%> (+4.57%) ⬆️
backend/app/db/repositories/saga_repository.py 55.40% <0.00%> (+8.10%) ⬆️
backend/app/events/event_store.py 72.64% <0.00%> (+6.29%) ⬆️
...end/app/db/repositories/notification_repository.py 89.85% <0.00%> (+21.47%) ⬆️
backend/app/db/repositories/dlq_repository.py 80.21% <0.00%> (+32.44%) ⬆️
backend/app/db/repositories/event_repository.py 67.40% <0.00%> (+19.84%) ⬆️
... and 1 more

... and 66 files with indirect coverage changes


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
docker-compose.ci.yaml (2)

117-124: Consider adding a healthcheck to shared-ca service.

The shared-ca service is a dependency for cert-generator with condition: service_completed_successfully. However, the sleep 1 command means it completes almost immediately. While this works, adding an explicit healthcheck or using a more deterministic completion signal would be more robust.


56-88: Consider using a specific Kafka image tag for reproducibility.

The tag bitnami/kafka:3.6 is valid and points to the latest patch in the 3.6 series. However, for consistent CI builds, use an explicit full tag like bitnami/kafka:3.6.2-debian-12-r5 to avoid potential manifest issues across different architectures or future tag changes.

   kafka:
-    image: bitnami/kafka:3.6
+    image: bitnami/kafka:3.6.2-debian-12-r5
     container_name: kafka
.github/workflows/frontend-ci.yml (1)

117-120: Consider consolidating redundant log capture.

Line 117 captures all compose logs, while lines 118-120 capture individual service logs. The individual service logs are subsets of the full log. Consider if you need both, or if filtering could be applied during analysis instead.

🔎 Alternative: Keep only full logs or add filtering
          mkdir -p logs
          docker compose -f docker-compose.ci.yaml logs > logs/docker-compose.log 2>&1
-          docker compose -f docker-compose.ci.yaml logs backend > logs/backend.log 2>&1
-          docker compose -f docker-compose.ci.yaml logs frontend > logs/frontend.log 2>&1
-          docker compose -f docker-compose.ci.yaml logs kafka > logs/kafka.log 2>&1
+          # Individual logs can be extracted from docker-compose.log if needed
          kubectl get events --sort-by='.metadata.creationTimestamp' -A > logs/k8s-events.log 2>&1 || true

Alternatively, keep individual logs if you prefer separate files for easier debugging but note the duplication.

.github/workflows/backend-ci.yml (1)

159-160: Consider adding Kafka and Schema Registry logs for k8s job failures.

The non-k8s integration job collects Kafka and Schema Registry logs on failure (lines 79-81), but the k8s-integration job only collects the general compose log. For consistency and easier debugging, consider adding service-specific logs.

🔎 Proposed enhancement
        run: |
          mkdir -p logs
          docker compose -f docker-compose.ci.yaml logs > logs/docker-compose.log 2>&1
+          docker compose -f docker-compose.ci.yaml logs kafka > logs/kafka.log 2>&1
+          docker compose -f docker-compose.ci.yaml logs schema-registry > logs/schema-registry.log 2>&1
          kubectl get events --sort-by='.metadata.creationTimestamp' -A > logs/k8s-events.log 2>&1 || true
          kubectl describe pods -A > logs/k8s-describe-pods.log 2>&1 || true
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da86acc and c86129f.

📒 Files selected for processing (4)
  • .github/actions/setup-ci-compose/action.yml
  • .github/workflows/backend-ci.yml
  • .github/workflows/frontend-ci.yml
  • docker-compose.ci.yaml
💤 Files with no reviewable changes (1)
  • .github/actions/setup-ci-compose/action.yml
🧰 Additional context used
🪛 Checkov (3.2.334)
docker-compose.ci.yaml

[medium] 160-161: Basic Auth Credentials

(CKV_SECRET_4)

.github/workflows/backend-ci.yml

[medium] 50-51: Basic Auth Credentials

(CKV_SECRET_4)

🪛 GitHub Actions: Backend CI
docker-compose.ci.yaml

[error] 1-1: Command 'docker compose -f docker-compose.ci.yaml up -d --wait --wait-timeout 120' failed: manifest for bitnami/kafka:3.6 not found (manifest unknown).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: E2E Tests
🔇 Additional comments (10)
docker-compose.ci.yaml (3)

126-147: Network configuration conflict in cert-generator.

The service declares network_mode: host (line 144) while also having a depends_on relationship with shared-ca which uses ci-network. When using network_mode: host, the container cannot communicate with containers on the custom bridge network by their service names. This may cause issues if cert-generator needs to reach services on ci-network.

Additionally, network_mode: host and networks: are mutually exclusive in Docker Compose - the service correctly omits networks: but the dependency on shared-ca (which is on ci-network) might not work as expected.

Verify that the cert-generator can successfully complete its task without needing to communicate with other services on the ci-network. The volume mount approach appears to be the intended communication method via /shared_ca.


21-39: Good CI optimizations for infrastructure services.

The configuration demonstrates thoughtful CI optimizations:

  • tmpfs for MongoDB (/data/db) eliminates disk I/O
  • Redis memory limits and disabled persistence (--save "")
  • KRaft mode Kafka with reduced heap and single-node replication factors
  • Appropriate health check intervals for CI speed

Also applies to: 56-88, 90-110


27-29: Hardcoded credentials are acceptable for CI-only configuration.

The static analysis tool flagged hardcoded credentials. These are intentionally hardcoded for a CI test environment and are not used in production. The file header and naming (docker-compose.ci.yaml) clearly indicate this is CI-specific. No action needed.

Also applies to: 160-161

.github/workflows/frontend-ci.yml (3)

84-89: Simplified kubeconfig creation approach looks good.

The sed-based approach to replace the server URL is cleaner than maintaining a separate heredoc template. The chmod 644 is appropriate for read access by Docker containers.


91-94: Appropriate use of --wait flag with generous timeout.

The --build --wait --wait-timeout 300 combination ensures services are built and healthy before proceeding. The 5-minute timeout is reasonable given the full stack build (cert-generator, backend, frontend).


113-121: Comprehensive log collection on failure.

Good addition of Kafka logs and Kubernetes events. The || true on kubectl ensures log collection doesn't fail if k8s setup was the issue.

.github/workflows/backend-ci.yml (4)

18-89: Well-structured non-k8s integration job.

The job correctly:

  • Sets up UV with caching for faster dependency installation
  • Starts only infrastructure services (no --profile full)
  • Explicitly ignores k8s tests with --ignore=tests/integration/k8s
  • Uses appropriate environment variables for local service access
  • Sets COVERAGE_CORE: sysmon for faster coverage collection per PR objectives

90-168: K8s integration job properly isolated.

Good separation of k8s-specific tests into a parallel job with:

  • Dedicated k8s setup with namespace creation
  • Appropriate KUBECONFIG and K8S_NAMESPACE environment variables
  • Distinct coverage flags (backend-k8s) for Codecov

46-55: Hardcoded test credentials are acceptable for CI.

The static analysis flagged credentials at lines 50-51. These are intentionally hardcoded for CI test environment matching the docker-compose.ci.yaml configuration. This is standard practice for CI pipelines.

Also applies to: 127-137


114-122: This concern is not valid.

The kubeconfig file is already owned by the runner user because the shell redirection (>) is processed by the runner's shell before sudo executes. This means sudo k3s kubectl config view --raw > /home/runner/.kube/config creates a runner-owned file, not a root-owned one. With chmod 600, the runner user has read and write access, which is correct. The tests can access the file without issues.

Likely an incorrect or invalid review comment.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
.github/workflows/backend-ci.yml (1)

18-24: Fix the invalid Kafka image tag in the environment variable.

The KAFKA_IMAGE environment variable uses bitnami/kafka:3.6 which doesn't exist. This needs to match the fix in docker-compose.ci.yaml.

🔎 Proposed fix
 env:
   MONGO_IMAGE: mongo:8.0
   REDIS_IMAGE: redis:7-alpine
-  KAFKA_IMAGE: bitnami/kafka:3.6
+  KAFKA_IMAGE: bitnami/kafka:3.6.2
   SCHEMA_REGISTRY_IMAGE: confluentinc/cp-schema-registry:7.5.0
🧹 Nitpick comments (2)
docker-compose.ci.yaml (1)

118-125: Consider a more restrictive chmod for shared CA directory.

chmod 777 grants full permissions to everyone. While acceptable for CI, chmod 755 would be slightly more secure while still allowing the necessary access.

🔎 Proposed fix
   shared-ca:
     image: alpine:latest
     profiles: ["full"]
     volumes:
       - shared_ca:/shared_ca
-    command: sh -c "mkdir -p /shared_ca && chmod 777 /shared_ca && sleep 1"
+    command: sh -c "mkdir -p /shared_ca && chmod 755 /shared_ca && sleep 1"
     networks:
       - ci-network
.github/workflows/backend-ci.yml (1)

145-184: Caching logic is duplicated across jobs.

The Docker image caching steps are identical in both integration and k8s-integration jobs. Consider extracting this to a reusable workflow or composite action to reduce duplication.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c86129f and 094ddbf.

📒 Files selected for processing (2)
  • .github/workflows/backend-ci.yml
  • docker-compose.ci.yaml
🧰 Additional context used
🪛 Checkov (3.2.334)
.github/workflows/backend-ci.yml

[medium] 98-99: Basic Auth Credentials

(CKV_SECRET_4)

docker-compose.ci.yaml

[medium] 161-162: Basic Auth Credentials

(CKV_SECRET_4)

🪛 GitHub Actions: Backend CI
docker-compose.ci.yaml

[error] 1-1: Docker image not found: bitnami/kafka:3.6 (manifest unknown). Docker could not pull the image required by docker-compose. Command failed: 'docker compose -f docker-compose.ci.yaml up -d --wait --wait-timeout 120'.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: E2E Tests
🔇 Additional comments (11)
docker-compose.ci.yaml (6)

21-39: Good CI optimization with tmpfs; credentials are acceptable for CI-only use.

The tmpfs mount for MongoDB data is a smart choice for faster CI cycles. The hardcoded credentials flagged by static analysis (CKV_SECRET_4) are acceptable here since this configuration is explicitly for CI testing and not production.


41-55: LGTM!

Redis configuration is appropriately tuned for CI with memory limits and LRU eviction policy.


91-111: LGTM!

Schema-registry is properly configured with health checks and correct dependency on Kafka.


127-148: LGTM!

The cert-generator is properly configured as a one-shot container with correct dependency ordering on shared-ca.


150-194: LGTM!

Backend service has proper health checks and dependency chain (cert-generator → infra services). The explicitly named ci-test-secret-key makes it clear this is for testing only.


195-228: LGTM!

Frontend service is correctly configured with dependency on backend health and proper TLS certificate mounts.

.github/workflows/backend-ci.yml (5)

33-72: Good caching strategy with parallel operations and zstd compression.

The Docker image caching approach is well-designed:

  • Parallel pulls/saves reduce wall-clock time
  • zstd compression balances speed vs size
  • Cache key includes all image versions for proper invalidation

86-121: LGTM!

Good use of --wait flag with timeout for infrastructure startup. The COVERAGE_CORE: sysmon setting aligns with the PR objective for faster CI tests.


123-136: LGTM!

Comprehensive log collection on failure will aid debugging CI issues.


203-231: LGTM!

The k3s setup is well-designed:

  • Disables traefik (not needed for tests)
  • Uses idempotent namespace creation (dry-run | apply)
  • Proper timeout for cluster readiness
  • Environment variables correctly configured for k8s tests

244-257: LGTM!

Good defensive logging with || true to ensure log collection doesn't fail the step, maximizing debug information availability.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
docker-compose.ci.yaml (1)

56-89: Kafka configuration successfully addresses the past image issue.

The switch from bitnami/kafka:3.6 to apache/kafka:3.9.0 resolves the invalid image tag issue flagged in previous reviews. The KRaft mode configuration is appropriate for CI, eliminating the Zookeeper dependency and speeding up startup.

The health check uses 30 retries (60 seconds total with 2s intervals). While this provides a safety margin for Kafka's startup, you might optionally reduce it to 20-25 retries if you find Kafka consistently starts faster in your CI environment.

.github/workflows/backend-ci.yml (1)

145-184: Consider extracting Docker caching into a reusable composite action.

The Docker image caching logic (lines 145-184) is identical to the integration job (lines 33-72). This duplication creates maintenance overhead and increases the risk of inconsistencies.

Consider creating a reusable composite action (e.g., .github/actions/cache-docker-images/action.yml) that both jobs can consume. This would centralize the caching logic and make future updates easier.

Example structure for a composite action

Create .github/actions/cache-docker-images/action.yml:

name: Cache Docker Images
description: Cache and load Docker images for CI
runs:
  using: composite
  steps:
    - name: Cache Docker images
      uses: actions/cache@v5
      id: docker-cache
      with:
        path: /tmp/docker-cache
        key: docker-${{ runner.os }}-${{ env.MONGO_IMAGE }}-${{ env.REDIS_IMAGE }}-${{ env.KAFKA_IMAGE }}-${{ env.SCHEMA_REGISTRY_IMAGE }}
    
    # ... rest of the caching logic

Then in both jobs:

- uses: ./.github/actions/cache-docker-images
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fc929b and 3613d03.

📒 Files selected for processing (2)
  • .github/workflows/backend-ci.yml
  • docker-compose.ci.yaml
🧰 Additional context used
🪛 Checkov (3.2.334)
.github/workflows/backend-ci.yml

[medium] 98-99: Basic Auth Credentials

(CKV_SECRET_4)

docker-compose.ci.yaml

[medium] 161-162: Basic Auth Credentials

(CKV_SECRET_4)

🔇 Additional comments (11)
docker-compose.ci.yaml (5)

1-15: LGTM! Clear documentation of CI compose usage.

The header comments provide helpful context on the two usage modes (infra-only vs full stack) and clearly explain the key differences from the main docker-compose.yaml file.


21-39: LGTM! MongoDB optimized for CI with tmpfs.

Using tmpfs for /data/db is an excellent optimization for CI speed, avoiding disk I/O overhead. The health check configuration is appropriate for the CI environment.


117-149: LGTM! Certificate generation flow correctly sequenced.

The dependency chain ensures proper startup order: shared-ca → cert-generator → backend/frontend. Using host network mode for cert-generator is appropriate for certificate propagation in CI.


150-193: LGTM! Backend service properly configured with all dependencies.

The backend service correctly depends on:

  • cert-generator (completion) for TLS setup
  • All infrastructure services (health) for runtime dependencies

The health check configuration with a 30s start period and 20 retries is appropriate given the backend's initialization requirements.


195-221: LGTM! Frontend service correctly depends on backend health.

The frontend service properly waits for backend health before starting, ensuring the API is available for E2E tests. The health check configuration is appropriate.

.github/workflows/backend-ci.yml (6)

18-23: LGTM! Image version pinning ensures cache stability.

Pinning image versions in environment variables provides consistent cache keys across CI runs, improving cache hit rates and reducing CI time.


33-72: LGTM! Efficient Docker image caching with parallel operations.

The caching strategy is well-implemented:

  • Parallel image pulls and saves with background jobs (& and wait)
  • zstd compression with multi-threading (-T0) and reasonable compression level (3)
  • Cache hit logic properly loads images in parallel

This should significantly speed up CI runs after the initial cache population.


86-110: LGTM! Test execution aligns with PR objectives.

The changes successfully implement the PR objectives:

  • ✅ Uses --wait instead of curl polling for service startup (line 88)
  • ✅ Sets COVERAGE_CORE: sysmon for improved coverage collection (line 104)
  • Correctly excludes k8s tests with --ignore=tests/integration/k8s

123-136: LGTM! Comprehensive log collection on failure.

The failure path collects relevant logs (docker-compose, kafka, schema-registry) with proper stderr redirection (2>&1). The artifact name backend-logs clearly distinguishes it from the k8s job's logs.


203-232: LGTM! k3s setup and test execution properly configured.

The k8s-integration job is well-structured:

  • k3s installation with traefik disabled (appropriate for test environment)
  • Idempotent namespace creation with --dry-run=client
  • Correct test filtering for k8s-specific tests
  • COVERAGE_CORE: sysmon set per PR objectives (line 226)
  • Separate coverage flag (backend-k8s) distinguishes k8s coverage from non-k8s

244-257: LGTM! k8s-specific log collection on failure.

The failure path appropriately collects Kubernetes events and pod descriptions alongside docker-compose logs. The artifact name k8s-logs clearly distinguishes it from the non-k8s job's logs.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
docker-compose.ci.yaml (1)

125-133: Consider using a more robust completion signal.

The sleep 1 pattern ensures the shared directory exists but is fragile. If the directory setup takes longer on slow CI runners, dependent services might fail.

🔎 Alternative approach using a touch file
   shared-ca:
     image: alpine:latest
     profiles: ["full"]
     volumes:
       - shared_ca:/shared_ca
-    command: sh -c "mkdir -p /shared_ca && chmod 777 /shared_ca && sleep 1"
+    command: sh -c "mkdir -p /shared_ca && chmod 777 /shared_ca && touch /shared_ca/.ready && tail -f /dev/null"
     networks:
       - ci-network
+    healthcheck:
+      test: ["CMD", "test", "-f", "/shared_ca/.ready"]
+      interval: 1s
+      timeout: 1s
+      retries: 5

Then update cert-generator to depend on health instead:

     depends_on:
       shared-ca:
-        condition: service_completed_successfully
+        condition: service_healthy
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3613d03 and 0bc30dc.

📒 Files selected for processing (1)
  • docker-compose.ci.yaml
🧰 Additional context used
🪛 Checkov (3.2.334)
docker-compose.ci.yaml

[medium] 171-172: Basic Auth Credentials

(CKV_SECRET_4)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E Tests
  • GitHub Check: Integration Tests
  • GitHub Check: K8s Integration Tests
🔇 Additional comments (9)
docker-compose.ci.yaml (9)

1-15: LGTM! Clear CI documentation.

The header comments provide excellent context on usage patterns and key differences from the production compose file.


27-29: Hardcoded credentials are acceptable for CI.

The MongoDB credentials are hardcoded, which is flagged by static analysis. For CI environments, this is acceptable since they're ephemeral and not exposed to production.


41-54: LGTM! Redis optimized for CI.

The memory limits, disabled persistence (--save ""), and tight healthcheck intervals are appropriate for ephemeral CI workloads.


91-111: LGTM! Schema Registry properly configured.

The service correctly depends on Kafka health, uses internal listener (kafka:29092), and has appropriate resource limits for CI.


135-156: Review the necessity of host networking for cert-generator.

Using network_mode: host bypasses Docker network isolation. This might be required for certificate generation that needs to access the host, but it increases the attack surface in CI.

Verify whether host networking is strictly necessary or if the same functionality can be achieved with the standard bridge network and volume mounts.


158-205: Hardcoded secrets are acceptable for CI environments.

Lines 171 and 178 contain hardcoded MongoDB credentials and SECRET_KEY, which static analysis correctly flags. For ephemeral CI environments, this is acceptable since:

  • The environment is temporary and isolated
  • No production data is involved
  • Credentials are not exposed outside the CI runner

The explicit TESTING=true flag (line 170) clearly marks this as a test configuration.


207-233: LGTM! Frontend properly configured for E2E testing.

The service correctly:

  • Depends on backend health before starting
  • Uses NODE_EXTRA_CA_CERTS to trust the self-signed CA
  • Has appropriate healthcheck with -k flag for self-signed certificates

235-239: LGTM! Standard volume and network configuration.

The shared_ca volume enables certificate sharing across services, and the bridge network provides appropriate isolation for CI.


56-89: Apache Kafka image apache/kafka:3.9.0 exists and is stable.

The official Apache Kafka image version 3.9.0 is available on Docker Hub and confirmed as a stable release.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/backend-ci.yml (1)

178-224: Docker caching logic is duplicated across jobs.

The Docker caching steps (lines 185-224) are nearly identical to the integration job (lines 73-112). While this duplication is necessary since jobs run on separate runners, consider extracting this into a composite action for maintainability if the caching logic needs to evolve.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cf4eb0 and df27af4.

📒 Files selected for processing (2)
  • .github/workflows/backend-ci.yml
  • README.md
🧰 Additional context used
🪛 Checkov (3.2.334)
.github/workflows/backend-ci.yml

[medium] 138-139: Basic Auth Credentials

(CKV_SECRET_4)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: E2E Tests
  • GitHub Check: K8s Integration Tests
  • GitHub Check: Unit Tests
  • GitHub Check: Integration Tests
🔇 Additional comments (7)
.github/workflows/backend-ci.yml (7)

18-23: LGTM!

Pinning image versions at the workflow level ensures cache key consistency and reproducible builds across all jobs.


26-64: LGTM!

The unit test job is well-structured with proper UV caching, coverage collection using sysmon core for speedup, and Codecov upload with distinct backend-unit flag.


73-112: Well-designed Docker image caching with parallel operations.

The caching strategy is efficient: parallel pulls/loads with & and wait, and zstd compression for optimal speed-to-size ratio. The cache key correctly includes all image versions for proper invalidation.

One consideration: if an individual docker load fails silently in the background, it won't be caught until compose starts. The failure would still be detected, but the error message might be less clear.


131-151: LGTM!

The test configuration correctly excludes k8s tests for the non-k8s integration path. The MongoDB credentials flagged by static analysis (CKV_SECRET_4) are CI-only test fixtures for ephemeral containers, not production secrets—this is a false positive.


163-176: LGTM!

Targeted log collection for Kafka and Schema Registry alongside the general docker-compose logs will aid debugging of integration test failures.


243-251: LGTM!

The k3s setup is well-structured with proper kubeconfig permissions, timeout for cluster readiness, and idempotent namespace creation using the dry-run pattern.


253-297: LGTM!

The k8s test configuration is well-structured with proper environment setup, distinct Codecov flag (backend-k8s), and comprehensive failure logging including Kubernetes events and pod descriptions.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/backend-ci.yml (1)

73-112: Consider extracting Docker caching logic to a reusable workflow.

The Docker image caching implementation is excellent and will significantly speed up CI runs. However, this 40-line block is duplicated identically in the k8s-integration job (lines 185-224).

Consider extracting it into a composite action or reusable workflow to improve maintainability and ensure consistency across both jobs.

💡 Example: Create a composite action

Create .github/actions/docker-cache/action.yml:

name: 'Cache Docker Images'
description: 'Cache and restore Docker images with zstd compression'
runs:
  using: 'composite'
  steps:
    - name: Cache Docker images
      uses: actions/cache@v5
      id: docker-cache
      with:
        path: /tmp/docker-cache
        key: docker-${{ runner.os }}-${{ env.MONGO_IMAGE }}-${{ env.REDIS_IMAGE }}-${{ env.KAFKA_IMAGE }}-${{ env.SCHEMA_REGISTRY_IMAGE }}

    - name: Load cached Docker images
      if: steps.docker-cache.outputs.cache-hit == 'true'
      shell: bash
      run: |
        echo "Loading cached images..."
        for f in /tmp/docker-cache/*.tar.zst; do
          zstd -d -c "$f" | docker load &
        done
        wait
        docker images

    - name: Pull and save Docker images
      if: steps.docker-cache.outputs.cache-hit != 'true'
      shell: bash
      run: |
        mkdir -p /tmp/docker-cache
        echo "Pulling images in parallel..."
        docker pull $MONGO_IMAGE &
        docker pull $REDIS_IMAGE &
        docker pull $KAFKA_IMAGE &
        docker pull $SCHEMA_REGISTRY_IMAGE &
        wait
        echo "Saving images with zstd compression..."
        docker save $MONGO_IMAGE | zstd -T0 -3 > /tmp/docker-cache/mongo.tar.zst &
        docker save $REDIS_IMAGE | zstd -T0 -3 > /tmp/docker-cache/redis.tar.zst &
        docker save $KAFKA_IMAGE | zstd -T0 -3 > /tmp/docker-cache/kafka.tar.zst &
        docker save $SCHEMA_REGISTRY_IMAGE | zstd -T0 -3 > /tmp/docker-cache/schema-registry.tar.zst &
        wait
        echo "Cache size:"
        du -sh /tmp/docker-cache/

Then replace both caching sections with:

- uses: ./.github/actions/docker-cache
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between df27af4 and d8988d6.

📒 Files selected for processing (1)
  • .github/workflows/backend-ci.yml
🧰 Additional context used
🪛 Checkov (3.2.334)
.github/workflows/backend-ci.yml

[medium] 138-139: Basic Auth Credentials

(CKV_SECRET_4)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: E2E Tests
  • GitHub Check: K8s Integration Tests
  • GitHub Check: Unit Tests
  • GitHub Check: Integration Tests
🔇 Additional comments (8)
.github/workflows/backend-ci.yml (8)

26-64: Well-implemented unit test job with performance optimizations.

The unit test job is cleanly separated from integration tests, includes proper UV caching, and leverages COVERAGE_CORE=sysmon for faster coverage collection. The 5-minute timeout is appropriate for unit tests.


126-129: Excellent improvement: replaced polling with --wait.

The use of docker compose --wait --wait-timeout 120 is a clean replacement for the previous curl polling approach mentioned in the PR objectives. This makes the startup process more reliable and easier to maintain.


134-142: Note: Hardcoded credentials are acceptable for CI.

The static analysis tool flagged the hardcoded MongoDB credentials. While hardcoded credentials are generally a security concern, these are specifically for ephemeral CI environments and pose no security risk. The credentials are:

  • Only used in isolated CI containers
  • Never exposed to production
  • Reset with each test run

131-161: Well-structured integration test execution with proper coverage separation.

The non-k8s integration tests are properly isolated using --ignore=tests/integration/k8s, and the COVERAGE_CORE=sysmon setting aligns with the PR's speed-up objectives. The coverage upload with the backend-integration flag enables proper tracking in Codecov.


243-251: Verify that the 90-second timeout for k3s cluster readiness is sufficient.

The k3s setup looks well-structured with proper kubeconfig handling and namespace creation. However, the 90-second timeout for kubectl cluster-info might be tight on slower runners or during resource contention.

Consider monitoring CI run times to ensure this timeout is consistently adequate. If you observe intermittent failures, consider increasing to 120 seconds to match the docker-compose wait timeout.


253-282: K8s integration tests properly configured with distinct coverage tracking.

The k8s integration tests are cleanly separated with their own job, use the backend-e2e coverage flag (matching the Codecov comment structure in the PR), and properly configure the Kubernetes environment variables. The COVERAGE_CORE=sysmon setting maintains consistency with the other test jobs for performance.


284-297: Comprehensive failure logging for k8s troubleshooting.

The log collection includes both docker-compose logs and Kubernetes-specific artifacts (events and pod descriptions), which will be invaluable for debugging k8s-related test failures.


18-23: All pinned Docker image versions are valid and actively maintained.

The images resolve correctly: mongo:8.0, redis:7-alpine, apache/kafka:3.9.0, and confluentinc/cp-schema-registry:7.5.0 are all official images from their respective registries. The Kafka version is current (released November 2024), addressing the prior concern about invalid image tags.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/tests/unit/schemas_pydantic/test_events_schemas.py (1)

77-77: Consider moving import to module level for consistency.

The import of EventMetadataResponse is local to this function, while all other imports are at the module level. Moving it to the top would improve consistency and make the dependency more discoverable.

🔎 Proposed refactor

Move the import to line 9-22 where other event schema imports are located:

 from app.schemas_pydantic.events import (
     EventAggregationRequest,
     EventBase,
     EventFilterRequest,
     EventInDB,
     EventListResponse,
+    EventMetadataResponse,
     EventProjection,
     EventQuery,
     EventResponse,
     EventStatistics,
     PublishEventRequest,
     PublishEventResponse,
     ResourceUsage,
 )

And remove the local import:

     # Minimal list response compose/decompose
-    from app.schemas_pydantic.events import EventMetadataResponse
     er = EventResponse(
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8988d6 and 98a8347.

📒 Files selected for processing (28)
  • backend/tests/conftest.py
  • backend/tests/integration/app/__init__.py
  • backend/tests/integration/app/test_main_app.py
  • backend/tests/integration/conftest.py
  • backend/tests/integration/core/test_dishka_lifespan.py
  • backend/tests/integration/db/repositories/__init__.py
  • backend/tests/integration/db/repositories/test_admin_events_repository.py
  • backend/tests/integration/db/repositories/test_admin_settings_repository.py
  • backend/tests/integration/db/repositories/test_admin_user_repository.py
  • backend/tests/integration/db/repositories/test_dlq_repository.py
  • backend/tests/integration/db/repositories/test_event_repository.py
  • backend/tests/integration/db/repositories/test_execution_repository.py
  • backend/tests/integration/db/repositories/test_notification_repository.py
  • backend/tests/integration/db/repositories/test_replay_repository.py
  • backend/tests/integration/db/repositories/test_saga_repository.py
  • backend/tests/integration/db/repositories/test_saved_script_repository.py
  • backend/tests/integration/db/repositories/test_sse_repository.py
  • backend/tests/integration/db/repositories/test_user_repository.py
  • backend/tests/integration/db/repositories/test_user_settings_repository.py
  • backend/tests/integration/events/test_admin_utils.py
  • backend/tests/integration/services/sse/__init__.py
  • backend/tests/integration/services/sse/test_redis_bus.py
  • backend/tests/unit/.env.unit
  • backend/tests/unit/conftest.py
  • backend/tests/unit/db/repositories/__init__.py
  • backend/tests/unit/events/test_admin_utils.py
  • backend/tests/unit/infrastructure/mappers/test_replay_api_mapper.py
  • backend/tests/unit/schemas_pydantic/test_events_schemas.py
💤 Files with no reviewable changes (2)
  • backend/tests/unit/infrastructure/mappers/test_replay_api_mapper.py
  • backend/tests/unit/events/test_admin_utils.py
✅ Files skipped from review due to trivial changes (2)
  • backend/tests/integration/services/sse/__init__.py
  • backend/tests/integration/app/__init__.py
🧰 Additional context used
🧬 Code graph analysis (2)
backend/tests/integration/events/test_admin_utils.py (1)
backend/app/events/admin_utils.py (3)
  • AdminUtils (10-62)
  • ensure_topics_exist (50-58)
  • check_topic_exists (27-34)
backend/tests/unit/schemas_pydantic/test_events_schemas.py (2)
backend/app/schemas_pydantic/events.py (2)
  • EventMetadataResponse (26-37)
  • EventResponse (40-52)
backend/app/domain/events/event_models.py (1)
  • correlation_id (93-94)
🪛 dotenv-linter (4.0.0)
backend/tests/unit/.env.unit

[warning] 2-2: [UnorderedKey] The SECRET_KEY key should go before the TESTING key

(UnorderedKey)


[warning] 3-3: [UnorderedKey] The ENABLE_TRACING key should go before the SECRET_KEY key

(UnorderedKey)


[warning] 4-4: [UnorderedKey] The OTEL_SDK_DISABLED key should go before the SECRET_KEY key

(UnorderedKey)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E Tests
  • GitHub Check: Integration Tests
  • GitHub Check: K8s Integration Tests
🔇 Additional comments (9)
backend/tests/unit/schemas_pydantic/test_events_schemas.py (1)

78-87: Excellent improvement to test data accuracy.

The switch from an empty dict to a properly structured EventMetadataResponse instance makes the test more accurate and aligns with the actual schema definition. All required fields (service_name, service_version, correlation_id) are correctly provided, making the test both realistic and maintainable.

backend/tests/conftest.py (1)

189-191: LGTM! Clear documentation of test fixture reorganization.

The comments effectively explain the separation of concerns: integration tests handle cleanup via their own conftest, while unit tests use guards to prevent infrastructure access.

backend/tests/integration/conftest.py (1)

1-24: LGTM! Solid integration test isolation strategy.

The autouse cleanup fixture ensures each integration test runs with a clean MongoDB and Redis state, filtering out system collections appropriately.

backend/tests/integration/events/test_admin_utils.py (2)

19-20: LGTM!

The topic existence check properly validates that the topic was created/exists.


15-17: Weak assertion doesn't validate anything meaningful.

The assertion assert res.get(topic) in (True, False) will pass even if res.get(topic) returns None. This doesn't verify that the topic was actually processed by ensure_topics_exist.

🔎 Proposed fix
-    # Ensure topic exists (idempotent)
-    res = await au.ensure_topics_exist([(topic, 1)])
-    assert res.get(topic) in (True, False)  # Some clusters may report exists
+    # Ensure topic exists (idempotent)
+    res = await au.ensure_topics_exist([(topic, 1)])
+    assert topic in res, f"Topic {topic} should be in results"

Likely an incorrect or invalid review comment.

backend/tests/unit/.env.unit (1)

1-4: LGTM! Appropriate unit test environment configuration.

The environment variables correctly configure unit test mode and disable tracing/telemetry.

backend/tests/unit/conftest.py (3)

1-9: LGTM! Proper unit test environment setup.

Loading .env.unit with override=True ensures unit test configuration takes precedence.


12-15: LGTM! No-op cleanup provides fixture compatibility.

The autouse=False no-op _cleanup fixture allows tests that explicitly depend on _cleanup to work in unit test context without performing unnecessary cleanup operations.


18-35: LGTM! Excellent enforcement of unit test boundaries.

The guard fixtures correctly prevent unit tests from accessing infrastructure dependencies, with clear error messages directing developers to use mocks or integration tests.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
backend/tests/unit/services/pod_monitor/test_monitor.py (2)

202-202: Consider whether substring check is intentional.

The assertion changed from an exact match to a case-insensitive substring check. While this makes the test more flexible, it could pass for unexpected state values like "RUNNING_IDLE" or "PENDING_IDLE".

If the state value format is stable, an exact comparison might catch regressions more reliably:

assert status["state"] == "IDLE"  # or whatever the exact expected value is

560-560: Tests verify graceful handling without assertions.

These tests validate that methods handle edge cases (already running/stopped, malformed input) without raising exceptions. The lack of explicit assertions is intentional—the test passes if no exception occurs.

For clarity, you could optionally add brief comments:

await pm.start()  # Should not raise when already running

Also applies to: 569-569, 605-605

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98a8347 and 2ec7474.

📒 Files selected for processing (1)
  • backend/tests/unit/services/pod_monitor/test_monitor.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/tests/unit/services/pod_monitor/test_monitor.py (1)
backend/app/services/pod_monitor/monitor.py (3)
  • PodMonitor (132-558)
  • create_pod_monitor (562-574)
  • _update_resource_version (322-328)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E Tests
  • GitHub Check: K8s Integration Tests
  • GitHub Check: Integration Tests
🔇 Additional comments (3)
backend/tests/unit/services/pod_monitor/test_monitor.py (3)

13-43: Good centralization of K8s stubs.

The shared stub classes effectively reduce duplication across tests. The minimal implementations are appropriate for unit testing where the focus is on the monitor's logic rather than K8s client behavior.


74-81: Excellent refactoring of K8s patching logic.

The _patch_k8s helper centralizes monkeypatching with sensible defaults while allowing test-specific overrides. This significantly reduces boilerplate and makes test setup more maintainable.
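
For readers unfamiliar with the pattern, a generic helper with overridable defaults could look like this (invented module and stub names; the real _patch_k8s signature and targets in test_monitor.py differ):

from types import SimpleNamespace
from typing import Any

import pytest

# Stand-in for the module under test (in the real suite this would be
# app.services.pod_monitor.monitor).
monitor_module = SimpleNamespace(CoreV1Api=None, Watch=None)


class StubCoreV1Api:
    def list_namespaced_pod(self, namespace: str, **kwargs: Any) -> list[Any]:
        return []


class StubWatch:
    def stream(self, *args: Any, **kwargs: Any) -> list[Any]:
        return []


def _patch_k8s(monkeypatch: pytest.MonkeyPatch, **overrides: Any) -> None:
    """Install default k8s stubs, letting a test override individual pieces."""
    defaults: dict[str, Any] = {"CoreV1Api": StubCoreV1Api, "Watch": StubWatch}
    defaults.update(overrides)
    for name, stub in defaults.items():
        monkeypatch.setattr(monitor_module, name, stub)


def test_uses_default_stubs(monkeypatch: pytest.MonkeyPatch) -> None:
    _patch_k8s(monkeypatch)
    assert monitor_module.CoreV1Api is StubCoreV1Api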


109-109: Consistent use of the new patching helper.

The tests correctly leverage _patch_k8s with appropriate overrides where needed, maintaining clear test intent while reducing setup boilerplate.

Also applies to: 541-541, 650-650, 675-675

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
backend/tests/integration/db/repositories/test_user_repository.py (1)

12-12: Fix the MyPy type checking failure for the untyped db parameter.

The pipeline failure indicates MyPy's strict mode is rejecting the untyped db parameter. The # type: ignore[valid-type] comment is likely using the wrong error code for this scenario.

Given the commit message mentions "motor -> pymongo", ensure you're using the correct type annotation. Consider one of these solutions:

  1. Add proper type annotation (preferred):

    from pymongo.database import Database
    
    async def test_create_get_update_delete_user(db: Database) -> None:
  2. If the type is complex or fixture-dependent, use the correct mypy error code:

    async def test_create_get_update_delete_user(db) -> None:  # type: ignore[no-untyped-def]
backend/workers/run_saga_orchestrator.py (1)

114-114: Critical: close() must be awaited.

MyPy correctly identifies that db_client.close() returns a coroutine that is not being awaited, which means the database connection may not close properly.

🔎 Proposed fix
-        db_client.close()
+        await db_client.close()
backend/app/services/k8s_worker/worker.py (1)

580-580: Use stack.push_async_callback() instead of stack.callback() for async MongoDB client cleanup.

AsyncMongoClient.close() is an async coroutine in PyMongo 4.9.2 and must be awaited. The code at line 580 uses stack.callback() (for synchronous functions), but should use stack.push_async_callback() to properly handle the async cleanup, consistent with the other async callbacks on lines 578–579.
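
To make the callback vs. push_async_callback distinction concrete, here is a minimal generic example of AsyncExitStack cleanup registration (not the worker's actual shutdown code):

import asyncio
from contextlib import AsyncExitStack


class FakeAsyncClient:
    async def close(self) -> None:  # mirrors an async close() such as AsyncMongoClient's
        print("async close awaited")


def sync_cleanup() -> None:
    print("sync cleanup ran")


async def main() -> None:
    async with AsyncExitStack() as stack:
        stack.callback(sync_cleanup)             # synchronous cleanup: fine
        client = FakeAsyncClient()
        stack.push_async_callback(client.close)  # coroutine function: must be registered this way
        # stack.callback(client.close) would leave an un-awaited coroutine at
        # exit, so the connection would never actually be closed.


asyncio.run(main())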

backend/workers/run_event_replay.py (1)

71-71: Critical: Use push_async_callback instead of callback for async close().

PyMongo's AsyncMongoClient.close() is now a coroutine that must be awaited. Using stack.callback() will not await the coroutine, causing a resource leak and MyPy error.

🔎 Proposed fix
-        stack.callback(db_client.close)
+        stack.push_async_callback(db_client.close)
backend/tests/fixtures/real_services.py (2)

96-101: Critical: Missing await on client.close().

PyMongo's AsyncMongoClient.close() is a coroutine that must be awaited. The synchronous call will cause a resource leak and MyPy error.

🔎 Proposed fix
         # Drop test MongoDB database
         if self.mongo_client:
             await self.mongo_client.drop_database(self.db_name)
-            self.mongo_client.close()
+            await self.mongo_client.close()

314-320: Critical: Missing await on client.close().

PyMongo's AsyncMongoClient.close() is a coroutine that must be awaited.

🔎 Proposed fix
         client = AsyncIOMotorClient(
             "mongodb://root:rootpassword@localhost:27017",
             serverSelectionTimeoutMS=5000
         )
         await client.admin.command("ping")
-        client.close()
+        await client.close()
backend/scripts/seed_users.py (1)

107-107: Critical: Missing await on client.close().

PyMongo's AsyncMongoClient.close() returns a coroutine that must be awaited. The pipeline failure explicitly flags this: "Value of type 'Coroutine[Any, Any, None]' must be used. Are you missing an await?"

🔎 Proposed fix
-    client.close()
+    await client.close()
backend/app/core/database_context.py (1)

158-172: Critical: Remove incorrect await on start_session().

PyMongo's client.start_session() returns an AsyncClientSession context manager directly, not a coroutine. The pipeline failure explicitly flags this: "Incompatible types in 'await' (actual type 'AsyncClientSession', expected type 'Awaitable[Any]')".

🔎 Proposed fix
-        async with await self.client.start_session() as session:
+        async with self.client.start_session() as session:
             async with session.start_transaction():
                 yield session
backend/app/services/coordinator/coordinator.py (1)

543-548: Critical: Use push_async_callback instead of callback for async close().

PyMongo's AsyncMongoClient.close() is a coroutine that must be awaited. Using stack.callback() will not await the coroutine, causing a resource leak and contributing to the MyPy type checking failure.

🔎 Proposed fix
         await stack.enter_async_context(coordinator)
         stack.push_async_callback(idem_manager.close)
         stack.push_async_callback(r.aclose)
-        stack.callback(db_client.close)
+        stack.push_async_callback(db_client.close)
🧹 Nitpick comments (4)
backend/tests/integration/events/test_event_store.py (1)

5-10: Import ordering: third-party import after local imports.

The pymongo import at line 10 is placed after the app.* imports. Standard Python convention (PEP 8) groups imports as: stdlib → third-party → local. Consider moving the pymongo import before the app imports for consistency.

🔎 Suggested reordering
 from datetime import datetime, timezone, timedelta
 
 import pytest
+from pymongo.asynchronous.database import AsyncDatabase as AsyncIOMotorDatabase
 
 from app.events.event_store import EventStore
 from app.events.schema.schema_registry import SchemaRegistryManager
 from app.infrastructure.kafka.events.metadata import AvroEventMetadata
 from app.infrastructure.kafka.events.pod import PodCreatedEvent
 from app.infrastructure.kafka.events.user import UserLoggedInEvent
-from pymongo.asynchronous.database import AsyncDatabase as AsyncIOMotorDatabase
backend/tests/integration/conftest.py (1)

10-24: Consider extracting duplicate cleanup logic.

The pre-test (lines 11-15) and post-test (lines 20-24) cleanup logic is identical. Extracting to a helper improves maintainability.

🔎 Suggested refactor
+async def _do_cleanup(db: AsyncIOMotorDatabase, redis_client: redis.Redis) -> None:
+    collections = await db.list_collection_names()
+    for name in collections:
+        if not name.startswith("system."):
+            await db.drop_collection(name)
+    await redis_client.flushdb()
+
+
 @pytest_asyncio.fixture(scope="function", autouse=True)
 async def _cleanup(db: AsyncIOMotorDatabase, redis_client: redis.Redis):
     """Clean DB and Redis before/after each integration test."""
-    # Pre-test cleanup
-    collections = await db.list_collection_names()
-    for name in collections:
-        if not name.startswith("system."):
-            await db.drop_collection(name)
-    await redis_client.flushdb()
+    await _do_cleanup(db, redis_client)
 
     yield
 
-    # Post-test cleanup
-    collections = await db.list_collection_names()
-    for name in collections:
-        if not name.startswith("system."):
-            await db.drop_collection(name)
-    await redis_client.flushdb()
+    await _do_cleanup(db, redis_client)
.github/workflows/backend-ci.yml (2)

73-112: Consider extracting Docker caching to a composite action.

The Docker image caching logic (cache restore, parallel pull, zstd save) is duplicated between the integration and e2e jobs. Extracting this to a composite action would reduce duplication and simplify maintenance.

Also applies to: 185-224


85-88: Background docker load errors may be silently ignored.

When loading images in parallel with &, if any docker load command fails, the error is not captured before wait. Consider adding error handling or using set -e at the script start to fail on first error.

🔎 Suggested fix
       - name: Load cached Docker images
         if: steps.docker-cache.outputs.cache-hit == 'true'
         run: |
+          set -e
           echo "Loading cached images..."
+          pids=()
           for f in /tmp/docker-cache/*.tar.zst; do
             zstd -d -c "$f" | docker load &
+            pids+=($!)
           done
-          wait
+          for pid in "${pids[@]}"; do
+            wait "$pid" || exit 1
+          done
           docker images
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ec7474 and a5949f5.

⛔ Files ignored due to path filters (1)
  • backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (36)
  • .github/workflows/backend-ci.yml
  • backend/app/core/database_context.py
  • backend/app/services/coordinator/coordinator.py
  • backend/app/services/k8s_worker/worker.py
  • backend/pyproject.toml
  • backend/scripts/seed_users.py
  • backend/tests/conftest.py
  • backend/tests/fixtures/real_services.py
  • backend/tests/integration/app/test_main_app.py
  • backend/tests/integration/conftest.py
  • backend/tests/integration/core/test_container.py
  • backend/tests/integration/core/test_database_context.py
  • backend/tests/integration/db/repositories/test_admin_events_repository.py
  • backend/tests/integration/db/repositories/test_admin_settings_repository.py
  • backend/tests/integration/db/repositories/test_admin_user_repository.py
  • backend/tests/integration/db/repositories/test_dlq_repository.py
  • backend/tests/integration/db/repositories/test_event_repository.py
  • backend/tests/integration/db/repositories/test_notification_repository.py
  • backend/tests/integration/db/repositories/test_replay_repository.py
  • backend/tests/integration/db/repositories/test_saga_repository.py
  • backend/tests/integration/db/repositories/test_saved_script_repository.py
  • backend/tests/integration/db/repositories/test_sse_repository.py
  • backend/tests/integration/db/repositories/test_user_repository.py
  • backend/tests/integration/db/repositories/test_user_settings_repository.py
  • backend/tests/integration/events/test_event_store.py
  • backend/tests/integration/events/test_event_store_consumer.py
  • backend/tests/integration/events/test_event_store_consumer_flush_e2e.py
  • backend/tests/integration/events/test_event_store_e2e.py
  • backend/tests/integration/k8s/test_k8s_worker_create_pod.py
  • backend/tests/integration/result_processor/test_result_processor.py
  • backend/tests/integration/services/admin/test_admin_user_service.py
  • backend/tests/integration/services/saved_script/test_saved_script_service.py
  • backend/tests/integration/services/sse/test_redis_bus.py
  • backend/workers/dlq_processor.py
  • backend/workers/run_event_replay.py
  • backend/workers/run_saga_orchestrator.py
💤 Files with no reviewable changes (1)
  • backend/pyproject.toml
✅ Files skipped from review due to trivial changes (1)
  • backend/tests/integration/db/repositories/test_admin_user_repository.py
🧰 Additional context used
🧬 Code graph analysis (13)
backend/tests/integration/events/test_event_store_consumer.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/events/test_event_store.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/services/sse/test_redis_bus.py (3)
backend/app/schemas_pydantic/sse.py (2)
  • RedisSSEMessage (63-68)
  • RedisNotificationMessage (102-112)
backend/app/dlq/models.py (1)
  • event_type (93-95)
backend/app/services/coordinator/queue_manager.py (1)
  • execution_id (30-31)
backend/tests/integration/events/test_event_store_consumer_flush_e2e.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/events/test_event_store_e2e.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/k8s/test_k8s_worker_create_pod.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/conftest.py (3)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/conftest.py (3)
  • scope (172-174)
  • db (178-180)
  • redis_client (184-186)
backend/tests/unit/conftest.py (3)
  • _cleanup (13-15)
  • db (19-20)
  • redis_client (24-25)
backend/tests/integration/services/admin/test_admin_user_service.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/result_processor/test_result_processor.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/conftest.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/core/test_database_context.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/core/test_container.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/fixtures/real_services.py (1)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
🪛 Checkov (3.2.334)
.github/workflows/backend-ci.yml

[medium] 138-139: Basic Auth Credentials

(CKV_SECRET_4)

🪛 GitHub Actions: MyPy Type Checking
backend/tests/integration/db/repositories/test_sse_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/events/test_event_store_consumer.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_user_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_saved_script_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/events/test_event_store.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

.github/workflows/backend-ci.yml

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/services/sse/test_redis_bus.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_admin_events_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/events/test_event_store_consumer_flush_e2e.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_event_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_admin_settings_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/workers/run_saga_orchestrator.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'


[error] 114-114: Value of type "Coroutine[Any, Any, None]" must be used [unused-coroutine]. Are you missing an await?

backend/tests/integration/app/test_main_app.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/workers/run_event_replay.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/app/core/database_context.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'


[error] 170-170: Incompatible types in "await" (actual type "AsyncClientSession", expected type "Awaitable[Any]") [misc]

backend/app/services/coordinator/coordinator.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_replay_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/events/test_event_store_e2e.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/k8s/test_k8s_worker_create_pod.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/conftest.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/services/admin/test_admin_user_service.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_user_settings_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/result_processor/test_result_processor.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/scripts/seed_users.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'


[error] 107-107: Value of type "Coroutine[Any, Any, None]" must be used [unused-coroutine]. Are you missing an await?

backend/tests/conftest.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_dlq_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/services/saved_script/test_saved_script_service.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/core/test_database_context.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/app/services/k8s_worker/worker.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/core/test_container.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_notification_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/workers/dlq_processor.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/fixtures/real_services.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

backend/tests/integration/db/repositories/test_saga_repository.py

[error] 1-1: mypy check failed in CI step: 'uv run mypy --config-file pyproject.toml --strict .'

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Scan Backend
  • GitHub Check: Scan Frontend
  • GitHub Check: E2E Tests
  • GitHub Check: E2E Tests
🔇 Additional comments (46)
backend/tests/integration/db/repositories/test_user_repository.py (1)

8-8: LGTM! Correct reclassification from unit to integration test.

This change appropriately reflects that the test performs actual database operations, making it an integration test rather than a unit test.

backend/tests/integration/db/repositories/test_saga_repository.py (2)

9-9: LGTM! Test marker updated appropriately.

The change from pytest.mark.unit to pytest.mark.integration correctly reflects that this test requires database fixtures and tests repository integration logic.


27-28: No changes needed. The return type change from list to SagaListResult is properly implemented with correct return type annotation, and all usages across the codebase have been consistently updated to access .sagas. The assertion change at line 28 is correct.

backend/tests/integration/db/repositories/test_notification_repository.py (2)

15-15: LGTM! Correct test categorization.

The marker change from unit to integration is appropriate. These tests interact with a database through the db fixture and perform actual I/O operations on MongoDB collections, which clearly qualifies them as integration tests rather than unit tests.


1-1: The datetime.UTC import is valid for the target Python version.

The import from datetime import datetime, UTC, timedelta is syntactically correct for Python 3.12+ (UTC was added to the datetime module in Python 3.11). The usage of UTC in the code is also correct (e.g., datetime.now(UTC) on lines 50, 67, 90). If mypy is reporting a failure in this file, the issue is likely not related to this import statement. Please share the specific mypy error message from the CI pipeline to investigate further.

backend/tests/integration/k8s/test_k8s_worker_create_pod.py (2)

36-43: No action needed. The KubernetesWorker constructor already correctly accepts Database (which is AsyncDatabase[MongoDocument] from pymongo's async API). The type is properly defined and matches the test fixture being passed.

Likely an incorrect or invalid review comment.


29-29: No action required. The test file correctly retrieves the database from the dependency injection container. All integration tests consistently use this pattern (scope.get(AsyncIOMotorDatabase)), including the existing db fixture in conftest.py, which confirms that the DI container properly handles both the type alias and generic type resolution. The retrieval is functional and consistent with the codebase.

backend/tests/integration/db/repositories/test_admin_settings_repository.py (1)

6-6: LGTM! Test categorization is correct.

Reclassifying this as an integration test is appropriate since it interacts with a real database instance via the db fixture.

backend/tests/integration/db/repositories/test_saved_script_repository.py (1)

6-6: LGTM! Correct test categorization.

The integration marker properly reflects that these tests require database connectivity.

backend/tests/integration/events/test_event_store_e2e.py (1)

4-4: LGTM! Clean migration to PyMongo's async API.

The import alias maintains backward compatibility while migrating to PyMongo's asynchronous database type.

backend/tests/integration/services/sse/test_redis_bus.py (2)

7-7: LGTM! Appropriate test categorization.

Integration marker correctly reflects the Redis-based testing scope.


80-80: LGTM! Cleanup of redundant timeout arguments.

Removing the explicit timeout parameter is fine since _FakePubSub.get_message() defaults to timeout=0.5 on line 33, maintaining the same behavior.

Also applies to: 86-86, 114-114

backend/workers/run_saga_orchestrator.py (1)

20-20: LGTM! Consistent migration to PyMongo's async API.

The import and instantiation of AsyncMongoClient align with the repository-wide migration from Motor to PyMongo's asynchronous API.

Also applies to: 30-30

backend/tests/integration/db/repositories/test_replay_repository.py (1)

11-11: LGTM! Correct test categorization.

Reclassifying as integration is appropriate for tests that interact with the database and event collections.

backend/app/services/k8s_worker/worker.py (1)

12-12: LGTM! Consistent PyMongo async client migration.

The import and instantiation properly migrate to PyMongo's AsyncMongoClient.

Also applies to: 524-524

backend/workers/dlq_processor.py (2)

11-11: LGTM! Proper PyMongo async client migration.

The import and instantiation correctly use PyMongo's AsyncMongoClient.

Also applies to: 103-107


136-136: Verify async close() handling in AsyncExitStack.

As in worker.py and the critical issue flagged in run_saga_orchestrator.py, PyMongo's AsyncMongoClient.close() is a coroutine, so registering it with stack.callback() will never await it. Use stack.push_async_callback(db_client.close) instead.
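
For illustration, a minimal sketch of the difference between the two registration styles, assuming PyMongo's async API; the connection URI is a placeholder, not taken from the repository:

# Sketch of why an async close() needs push_async_callback.
# Assumes PyMongo's async API (pymongo >= 4.9); the URI is a placeholder.
import asyncio
from contextlib import AsyncExitStack

from pymongo import AsyncMongoClient


async def main() -> None:
    async with AsyncExitStack() as stack:
        db_client: AsyncMongoClient = AsyncMongoClient("mongodb://localhost:27017")

        # stack.callback(db_client.close) would only create a coroutine object
        # on exit and never await it (RuntimeWarning, leaked sockets).
        # push_async_callback awaits the coroutine when the stack unwinds.
        stack.push_async_callback(db_client.close)

        await db_client.admin.command("ping")


if __name__ == "__main__":
    asyncio.run(main())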

backend/workers/run_event_replay.py (2)

15-15: LGTM! Migration to PyMongo async API.

The import change from Motor to PyMongo's asynchronous API is correct.


39-39: LGTM! Client instantiation is correct.

The AsyncMongoClient instantiation preserves the correct connection parameters.

backend/tests/integration/db/repositories/test_user_settings_repository.py (1)

9-9: LGTM! Correct test marker.

The pytest marker correctly reflects that this is an integration test, aligning with the file's location in tests/integration/.

backend/tests/integration/services/admin/test_admin_user_service.py (1)

4-4: LGTM! Clean migration with aliasing.

The import change migrates to PyMongo's AsyncDatabase while preserving the AsyncIOMotorDatabase symbol for backward compatibility in tests.

backend/tests/fixtures/real_services.py (1)

12-13: LGTM! Clean migration with aliasing.

The import changes migrate to PyMongo's asynchronous API while preserving Motor-compatible symbol names through aliasing.

backend/tests/integration/events/test_event_store_consumer_flush_e2e.py (1)

5-5: LGTM! Clean migration with aliasing.

The import change migrates to PyMongo's AsyncDatabase while preserving the AsyncIOMotorDatabase symbol for backward compatibility.

backend/scripts/seed_users.py (3)

22-23: LGTM! Migration to PyMongo async API.

The import changes correctly migrate to PyMongo's asynchronous API.


29-29: LGTM! Type signature updated correctly.

The function signature correctly uses PyMongo's AsyncDatabase type.


77-77: LGTM! Client instantiation is correct.

The AsyncMongoClient instantiation is correct.

backend/app/core/database_context.py (5)

7-11: LGTM! Migration to PyMongo async API.

The import changes correctly migrate to PyMongo's asynchronous API modules.


17-23: LGTM! Type aliases updated correctly.

The type aliases are correctly updated to use PyMongo's asynchronous types while maintaining clear naming.


107-118: LGTM! Client initialization is correct.

The comment accurately reflects PyMongo's automatic event loop usage, and the client instantiation preserves all necessary connection parameters.


120-127: LGTM! Error handling correctly awaits close().

The error path correctly awaits client.close() since PyMongo's close method is a coroutine.


132-137: LGTM! Disconnect correctly awaits close().

The disconnect method correctly awaits self._client.close().

backend/app/services/coordinator/coordinator.py (2)

9-9: LGTM! Migration to PyMongo async API.

The import change correctly migrates to PyMongo's asynchronous AsyncMongoClient.


503-503: LGTM! Client instantiation is correct.

The AsyncMongoClient instantiation preserves the correct connection parameters.

backend/tests/integration/db/repositories/test_event_repository.py (1)

9-9: LGTM!

The marker reclassification from unit to integration aligns with the test reorganization objectives.

backend/tests/integration/db/repositories/test_dlq_repository.py (1)

9-9: LGTM!

The marker reclassification from unit to integration is consistent with the broader test suite reorganization.

backend/tests/integration/db/repositories/test_sse_repository.py (1)

6-6: LGTM!

The marker change appropriately reclassifies this test as integration-level.

backend/tests/integration/app/test_main_app.py (1)

15-15: LGTM!

The marker update correctly categorizes this test as integration-level, consistent with the test suite reorganization.

backend/tests/integration/result_processor/test_result_processor.py (1)

6-6: Remove or verify with actual MyPy output.

The AsyncIOMotorDatabase alias is consistently used across 12 test files (conftest.py, multiple integration test files, and fixtures). Without concrete MyPy error output showing that this alias causes type checking failures, the claim that it "correlates with reported MyPy failures" cannot be verified. If MyPy does report errors specifically on this alias, provide those error messages; otherwise, this pattern appears to be intentional and working.

backend/tests/integration/core/test_database_context.py (1)

4-4: The import alias AsyncDatabase as AsyncIOMotorDatabase is correct and does not cause type-checking issues. The mypy configuration in pyproject.toml explicitly excludes all test files (exclude = '(^tests/|/tests/)'), so test files are not type-checked by the pipeline. The alias is straightforward and used consistently across the test suite without complications.

Likely an incorrect or invalid review comment.

backend/tests/integration/events/test_event_store_consumer.py (1)

6-6: The import alias is not causing any MyPy failures. MyPy's configuration explicitly excludes all test files (exclude = '(^tests/|/tests/)'), so this file is not analyzed by MyPy at all. The import from pymongo.asynchronous.database import AsyncDatabase as AsyncIOMotorDatabase is valid, and the alias is used correctly and consistently throughout the file. No changes are needed.

Likely an incorrect or invalid review comment.

backend/tests/integration/db/repositories/test_admin_events_repository.py (1)

12-12: LGTM!

The test marker correctly reflects that this file is an integration test (located under tests/integration/). This aligns with the PR's goal of properly categorizing tests.

backend/tests/conftest.py (2)

14-14: LGTM!

The import migration from motor to pymongo's async API is consistent with the broader codebase migration.


189-191: Good documentation of the cleanup migration.

Clear comments explaining that autouse cleanup moved to tests/integration/conftest.py and unit tests use tests/unit/conftest.py. This helps future maintainers understand the test architecture.

.github/workflows/backend-ci.yml (3)

134-144: Hardcoded credentials are acceptable for CI test infrastructure.

The static analysis tool flagged the MongoDB credentials (CKV_SECRET_4). These are local test infrastructure credentials used only in CI, which is a common and acceptable pattern. No action needed.


243-251: k3s setup looks solid.

The Kubernetes setup with timeout, namespace creation using dry-run with apply (idempotent), and proper KUBECONFIG handling is well-designed for CI reliability.


284-296: Good failure diagnostics collection.

Collecting docker-compose logs, k8s events, and pod descriptions on failure provides excellent debugging context. The || true prevents log collection failures from masking the actual test failure.

Comment on lines 7 to +9
 pytestmark = [pytest.mark.integration, pytest.mark.mongodb]

-pytestmark = pytest.mark.unit
+pytestmark = pytest.mark.integration

⚠️ Potential issue | 🔴 Critical

Remove duplicate pytestmark assignment.

Line 9 overwrites line 7, losing the pytest.mark.mongodb marker. This prevents proper test collection and filtering.

🔎 Proposed fix
 pytestmark = [pytest.mark.integration, pytest.mark.mongodb]
-
-pytestmark = pytest.mark.integration
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-pytestmark = [pytest.mark.integration, pytest.mark.mongodb]
-
-pytestmark = pytest.mark.unit
-pytestmark = pytest.mark.integration
+pytestmark = [pytest.mark.integration, pytest.mark.mongodb]
🤖 Prompt for AI Agents
In backend/tests/integration/services/saved_script/test_saved_script_service.py
around lines 7 to 9, there is a duplicate pytestmark assignment where line 9
overwrites the earlier pytest.mark.mongodb marker; remove the second assignment
and consolidate so pytestmark includes both markers (e.g., a single pytestmark
list containing pytest.mark.integration and pytest.mark.mongodb) so the mongodb
marker is preserved for test collection.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/tests/fixtures/real_services.py (2)

101-101: Missing await on AsyncMongoClient.close().

Line 101 calls self.mongo_client.close() without await. Since the type changed from Motor's AsyncIOMotorClient to PyMongo's AsyncMongoClient (line 26, 34), the close() method now returns a coroutine that must be awaited.

🔎 Proposed fix
-            self.mongo_client.close()
+            await self.mongo_client.close()

168-169: Fix database_context.py to follow PyMongo async API conventions.

The patterns differ because backend/app/core/database_context.py (lines 170-171) uses an incorrect async pattern:

async with self.client.start_session() as session:
    async with await session.start_transaction():

This violates PyMongo async API requirements: start_session() must be awaited, but start_transaction() is not awaitable. The correct pattern (already used in backend/tests/fixtures/real_services.py lines 168-169) is:

async with await client.start_session() as session:
    async with session.start_transaction():

Update database_context.py to match this correct pattern.

♻️ Duplicate comments (1)
backend/tests/integration/k8s/test_k8s_worker_create_pod.py (1)

12-12: LGTM! Test fixture correctly updated to Database type.

The migration from AsyncIOMotorDatabase to Database aligns with the DI container's type registrations and the broader codebase migration to PyMongo async types.

Also applies to: 29-29

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a5949f5 and e5f43ef.

📒 Files selected for processing (21)
  • backend/app/core/database_context.py
  • backend/app/db/repositories/admin/admin_events_repository.py
  • backend/app/db/repositories/dlq_repository.py
  • backend/app/db/repositories/event_repository.py
  • backend/app/db/repositories/execution_repository.py
  • backend/app/db/repositories/notification_repository.py
  • backend/app/db/repositories/saga_repository.py
  • backend/app/dlq/manager.py
  • backend/app/events/event_store.py
  • backend/tests/conftest.py
  • backend/tests/fixtures/real_services.py
  • backend/tests/integration/conftest.py
  • backend/tests/integration/core/test_container.py
  • backend/tests/integration/core/test_database_context.py
  • backend/tests/integration/events/test_event_store.py
  • backend/tests/integration/events/test_event_store_consumer.py
  • backend/tests/integration/events/test_event_store_consumer_flush_e2e.py
  • backend/tests/integration/events/test_event_store_e2e.py
  • backend/tests/integration/k8s/test_k8s_worker_create_pod.py
  • backend/tests/integration/result_processor/test_result_processor.py
  • backend/tests/integration/services/admin/test_admin_user_service.py
✅ Files skipped from review due to trivial changes (1)
  • backend/app/db/repositories/notification_repository.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • backend/tests/integration/events/test_event_store.py
  • backend/tests/integration/conftest.py
  • backend/tests/integration/events/test_event_store_consumer_flush_e2e.py
  • backend/tests/integration/core/test_container.py
🧰 Additional context used
🧬 Code graph analysis (9)
backend/tests/integration/services/admin/test_admin_user_service.py (1)
backend/tests/conftest.py (3)
  • app (130-138)
  • db (178-180)
  • scope (172-174)
backend/app/db/repositories/admin/admin_events_repository.py (2)
backend/app/domain/events/query_builders.py (3)
  • EventStatsAggregation (59-125)
  • build_hourly_events_pipeline (94-101)
  • build_top_users_pipeline (104-110)
backend/app/domain/events/event_models.py (1)
  • HourlyEventCount (168-170)
backend/tests/integration/events/test_event_store_consumer.py (1)
backend/tests/conftest.py (3)
  • app (130-138)
  • db (178-180)
  • scope (172-174)
backend/tests/conftest.py (2)
backend/tests/unit/conftest.py (2)
  • app (34-35)
  • db (19-20)
backend/app/core/database_context.py (3)
  • database (67-69)
  • database (146-149)
  • database (201-202)
backend/tests/integration/events/test_event_store_e2e.py (1)
backend/tests/conftest.py (3)
  • app (130-138)
  • db (178-180)
  • scope (172-174)
backend/tests/integration/result_processor/test_result_processor.py (1)
backend/tests/conftest.py (3)
  • app (130-138)
  • db (178-180)
  • scope (172-174)
backend/tests/fixtures/real_services.py (2)
backend/tests/conftest.py (4)
  • app (130-138)
  • redis_client (184-186)
  • db (178-180)
  • client (150-161)
backend/app/core/database_context.py (6)
  • db_name (72-74)
  • db_name (152-153)
  • db_name (205-206)
  • client (62-64)
  • client (140-143)
  • client (197-198)
backend/app/db/repositories/execution_repository.py (2)
backend/app/domain/execution/models.py (2)
  • DomainExecution (13-26)
  • ResourceUsageDomain (43-64)
backend/app/services/coordinator/queue_manager.py (1)
  • execution_id (30-31)
backend/tests/integration/core/test_database_context.py (1)
backend/tests/conftest.py (3)
  • app (130-138)
  • db (178-180)
  • scope (172-174)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E Tests
  • GitHub Check: E2E Tests
  • GitHub Check: Integration Tests
🔇 Additional comments (27)
backend/app/db/repositories/execution_repository.py (2)

133-133: LGTM: Variable extraction improves readability.

Extracting resource_usage_data to a local variable is a good refactoring that improves code clarity and matches the pattern used in get_execution (line 58).


136-136: Defensive str() coercion is acceptable.

The explicit str() coercion ensures type consistency for execution_id, which is appropriate given the Motor → PyMongo migration context. While the fallback "" is already a string, this defensive approach guards against unexpected types from the database.

backend/tests/integration/result_processor/test_result_processor.py (1)

6-6: LGTM! Clean migration to the Database abstraction.

The import and type annotation changes correctly align with the project-wide migration from Motor's AsyncIOMotorDatabase to the internal Database abstraction. The DI resolution is updated appropriately.

Also applies to: 33-33

backend/app/dlq/manager.py (1)

337-339: LGTM! Correct PyMongo async aggregation pattern.

The change correctly awaits the aggregate() call before iterating over the cursor. This aligns with PyMongo's async API where aggregate() returns a coroutine that resolves to a cursor, unlike Motor's direct cursor return.
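
A minimal sketch of this aggregation pattern, assuming PyMongo's async API; the database, collection, and pipeline names are illustrative, not taken from the repository:

# Sketch: Motor returned a cursor from aggregate() directly; PyMongo's async
# API returns a coroutine that must be awaited to obtain the cursor.
from pymongo import AsyncMongoClient


async def count_by_topic(client: AsyncMongoClient) -> dict[str, int]:
    collection = client["test_db"]["dlq_messages"]   # illustrative names
    pipeline = [{"$group": {"_id": "$topic", "count": {"$sum": 1}}}]

    cursor = await collection.aggregate(pipeline)    # await to get the cursor
    return {doc["_id"]: doc["count"] async for doc in cursor}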

backend/tests/integration/services/admin/test_admin_user_service.py (1)

4-4: LGTM! Type migration correctly applied.

The import and DI resolution changes properly migrate to the Database abstraction. The test logic remains intact.

Also applies to: 15-15

backend/tests/integration/core/test_database_context.py (1)

4-4: LGTM! Database abstraction correctly integrated.

The test properly validates the Database type resolution from the DI container, aligning with the project-wide migration.

Also applies to: 13-13

backend/tests/integration/events/test_event_store_consumer.py (1)

6-6: LGTM! Type migration consistent with the broader changes.

The import and DI resolution correctly migrate to the Database abstraction.

Also applies to: 28-28

backend/app/events/event_store.py (1)

304-304: LGTM! Correct async aggregation pattern.

The change to await aggregate() before iterating aligns with PyMongo's async API where aggregate() returns a coroutine that resolves to a cursor.

backend/app/db/repositories/saga_repository.py (3)

96-97: LGTM! Consistent async aggregation pattern.

The change correctly awaits aggregate() before iterating, matching the PyMongo async API pattern applied throughout this PR.


126-127: LGTM! Async aggregation pattern correctly applied.


137-139: LGTM! Async aggregation pattern correctly applied.

backend/app/db/repositories/dlq_repository.py (5)

38-39: LGTM! Async aggregation pattern correctly applied.

The change awaits aggregate() before iterating over the cursor, following PyMongo's async API pattern.


61-64: LGTM! Async aggregation pattern correctly applied.


74-76: LGTM! Async aggregation pattern correctly applied.


97-99: LGTM! Two-step async aggregation pattern correctly applied.

The change splits the operation into awaiting aggregate() to get the cursor, then awaiting to_list() to convert results. This is the correct pattern for PyMongo's async API.


152-167: LGTM! Async aggregation pattern correctly applied.

The change awaits aggregate() before iterating, consistent with the PyMongo async migration throughout this PR.

backend/app/db/repositories/event_repository.py (3)

237-238: LGTM! Aggregation cursor handling correctly updated for PyMongo async.

The explicit cursor handling pattern (cursor = await self._collection.aggregate(pipeline) followed by await cursor.to_list(length=1)) correctly adapts to PyMongo's async API where aggregate() must be awaited to obtain a cursor.

Also applies to: 300-301


326-328: LGTM! Change stream usage correctly updated.

The pattern async with await self._collection.watch(...) correctly reflects PyMongo's async API where watch() returns an awaitable that yields an async context manager.


445-445: LGTM! Async iteration over aggregation cursors correctly implemented.

The pattern async for doc in await self._collection.aggregate(pipeline) properly awaits the aggregation to get the cursor before iterating.

Also applies to: 458-458

backend/app/db/repositories/admin/admin_events_repository.py (2)

123-124: LGTM! Aggregation patterns correctly standardized.

The explicit cursor handling (cursor = await self.events_collection.aggregate(pipeline) followed by await cursor.to_list(...)) correctly implements PyMongo's async API and aligns with patterns used in other repository files.

Also applies to: 144-145, 180-181


150-162: LGTM! Async iteration and defensive filtering.

The async iteration over aggregation cursors is correctly implemented, and the filter on line 161 to exclude None user_ids is good defensive coding.

backend/tests/integration/events/test_event_store_e2e.py (1)

4-4: LGTM! Test correctly migrated to Database abstraction.

The import and type annotation updates align with the Motor → PyMongo migration, using the new Database type alias from app.core.database_context.

Also applies to: 18-18

backend/tests/conftest.py (2)

14-14: LGTM! Test fixture correctly migrated to Database abstraction.

The db fixture now returns Database instead of AsyncIOMotorDatabase, aligning with the DI container's provider and the PyMongo migration.

Also applies to: 178-179


189-191: LGTM! Cleanup responsibility clarified.

Moving the autouse cleanup to tests/integration/conftest.py appropriately restricts database cleanup to integration tests only, avoiding unnecessary DB requirements for unit tests.

backend/app/core/database_context.py (2)

7-23: LGTM! Type aliases correctly migrated to PyMongo async types.

The migration from Motor types to PyMongo async equivalents is comprehensive and consistent:

  • DBClient: AsyncMongoClient[MongoDocument]
  • Database: AsyncDatabase[MongoDocument]
  • Collection: AsyncCollection[MongoDocument]
  • Cursor: AsyncCursor[MongoDocument]
  • DBSession: AsyncClientSession

107-108: LGTM! Connection lifecycle correctly updated for PyMongo async.

The removal of explicit io_loop binding (PyMongo async uses the current event loop implicitly) and the addition of await for client.close() calls correctly adapt to PyMongo's async API.

Also applies to: 126-126, 135-135
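
As a minimal sketch of the aliases and lifecycle described in these two comments, assuming PyMongo's async API; the URI, database name, and timeout are placeholders, not the project's actual configuration:

# Sketch of connect/ping/close with PyMongo's async types.
from typing import Any

from pymongo import AsyncMongoClient
from pymongo.asynchronous.database import AsyncDatabase

MongoDocument = dict[str, Any]


async def connect(
    uri: str, db_name: str
) -> tuple[AsyncMongoClient[MongoDocument], AsyncDatabase[MongoDocument]]:
    # No io_loop argument: the client binds to the running event loop.
    client: AsyncMongoClient[MongoDocument] = AsyncMongoClient(uri, serverSelectionTimeoutMS=5000)
    try:
        await client.admin.command("ping")
    except Exception:
        await client.close()   # close() is a coroutine in the async API
        raise
    return client, client[db_name]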

backend/tests/fixtures/real_services.py (1)

15-15: LGTM! Test fixtures correctly migrated to PyMongo async types.

The migration from Motor to PyMongo's AsyncMongoClient and the use of Database/DBClient type aliases align with the broader codebase migration.

Also applies to: 17-17, 26-26, 32-42

Comment on lines +170 to +171
async with self.client.start_session() as session:
async with await session.start_transaction():

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the actual code in the file mentioned
head -n 180 backend/app/core/database_context.py | tail -n 20

Repository: HardMax71/Integr8sCode

Length of output: 736


🏁 Script executed:

# Also check the file in real_services.py for comparison
sed -n '160,180p' backend/tests/fixtures/real_services.py

Repository: HardMax71/Integr8sCode

Length of output: 687


🏁 Script executed:

# Search for all session and transaction patterns in the codebase
echo "=== start_session patterns ===" && \
rg "start_session\(\)" --type=py -B2 -A2 && \
echo -e "\n=== start_transaction patterns ===" && \
rg "start_transaction\(\)" --type=py -B2 -A2

Repository: HardMax71/Integr8sCode

Length of output: 1619


🌐 Web query:

PyMongo AsyncMongoClient start_session start_transaction async/await context manager usage

💡 Result:

Short summary and example:

  • Use client.start_session() with await (it returns an async session you should use in an async with to auto-end). Start a transaction on that session with session.start_transaction() inside an async with; start_transaction itself is not awaited. Operations inside the transaction must pass session=session. Exiting the transaction block without exception commits; exiting with exception aborts. [1][2]

Example (Motor / PyMongo async API—both follow same pattern):

from motor.motor_asyncio import AsyncIOMotorClient
client = AsyncIOMotorClient(...)

async def do_tx():
    async with await client.start_session() as session:           # await to initialize session
        async with session.start_transaction():                  # no await on start_transaction()
            await client.db.orders.insert_one({"sku": "abc"}, session=session)
            await client.db.inventory.update_one(
                {"sku": "abc", "qty": {"$gte": 1}},
                {"$inc": {"qty": -1}},
                session=session,
            )
        # leaving the async with commits; exceptions abort.

Notes:

  • Do not reuse the same session concurrently from multiple coroutines. Use one transaction at a time per session. [1][2]

Sources:
[1] PyMongo client_session docs (async sessions & start_transaction).
[2] Motor AsyncIOMotorClient / AsyncIOMotorClientSession examples.


Fix PyMongo async session and transaction context manager usage.

The current code in backend/app/core/database_context.py (lines 170-171) has the async context managers reversed:

async with self.client.start_session() as session:
    async with await session.start_transaction():

According to PyMongo's async API, start_session() must be awaited and start_transaction() must not be awaited:

async with await self.client.start_session() as session:
    async with session.start_transaction():

The pattern used in backend/tests/fixtures/real_services.py is correct and should match this implementation.

🤖 Prompt for AI Agents
In backend/app/core/database_context.py around lines 170 to 171, the async
context managers are used in the wrong order: currently it awaits
session.start_transaction() instead of awaiting client.start_session(). Change
to await the client's start_session() in the outer async with and use
session.start_transaction() (without await) in the inner async with so the
pattern becomes: async with await self.client.start_session() as session: async
with session.start_transaction(): ensuring correct PyMongo async API usage.

Comment on lines +146 to +147
ResourceUsageDomain.from_dict(dict(resource_usage_data))
if resource_usage_data is not None

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the specific lines in question
cd backend/app/db/repositories && wc -l execution_repository.py

Repository: HardMax71/Integr8sCode

Length of output: 93


🏁 Script executed:

# Now examine the code around lines 146-147 and line 70
sed -n '140,155p' backend/app/db/repositories/execution_repository.py

Repository: HardMax71/Integr8sCode

Length of output: 814


🏁 Script executed:

# Check line 70 area for comparison
sed -n '65,80p' backend/app/db/repositories/execution_repository.py

Repository: HardMax71/Integr8sCode

Length of output: 862


🏁 Script executed:

# Search for all usages of ResourceUsageDomain.from_dict() to verify the pattern
rg -n -A2 -B2 'ResourceUsageDomain\.from_dict\(' --type=py

Repository: HardMax71/Integr8sCode

Length of output: 5157


🏁 Script executed:

# Find and examine the ResourceUsageDomain class and its from_dict method
rg -n -A10 'def from_dict' --type=py | grep -A10 'ResourceUsageDomain' || find . -name '*.py' -exec grep -l 'class ResourceUsageDomain' {} \;

Repository: HardMax71/Integr8sCode

Length of output: 965


Remove unnecessary dict() coercion for consistency.

The dict() wrapper at line 146 is inconsistent with the identical logic at line 70 in the same file, which calls ResourceUsageDomain.from_dict(resource_usage_data) without coercion. Since resource_usage_data is already a dict from MongoDB, the dict() call only creates an unnecessary shallow copy. This pattern is also inconsistent with other usages throughout the codebase (e.g., backend/app/infrastructure/mappers/sse_mapper.py:24).

Remove the dict() wrapper: ResourceUsageDomain.from_dict(resource_usage_data)

🤖 Prompt for AI Agents
In backend/app/db/repositories/execution_repository.py around lines 146-147,
remove the unnecessary dict() coercion when constructing the domain object:
replace ResourceUsageDomain.from_dict(dict(resource_usage_data)) with
ResourceUsageDomain.from_dict(resource_usage_data) so it matches the identical
usage at line 70 and other codepaths; ensure no other logic depends on the
shallow copy and run tests/linting to confirm consistency.

Comment on lines +71 to +72
indexes_cursor = await self.collection.list_indexes()
existing = await indexes_cursor.to_list(None)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for list_indexes usage in the codebase
rg 'list_indexes\(' --type=py -B 2 -A 2

Repository: HardMax71/Integr8sCode

Length of output: 1883


🌐 Web query:

PyMongo 4.9.2 list_indexes async method documentation

💡 Result:

The async Collection.list_indexes in PyMongo 4.9.2 is an async method that returns an AsyncCommandCursor over index documents. Signature (from the 4.9.2 async API):

async list_indexes(session: Optional[AsyncClientSession] = None, comment: Optional[Any] = None) -> AsyncCommandCursor[MutableMapping[str, Any]]

Usage example:
async for index in await db.test.list_indexes():
print(index)

(Added comment parameter in 4.1; method exists since 3.0.) [1][2]

Sources:

  • PyMongo 4.9.2 async Collection.list_indexes (API docs). [1]
  • PyMongo 4.9.2 release notes / changelog. [2]

The two-step await pattern is correct for PyMongo 4.9.2's async API and is consistently used throughout the codebase (notification_repository.py also follows this pattern).

However, note there is an inconsistency in the test file (test_schema_manager.py) which uses a single-await pattern: await db[coll].list_indexes().to_list(length=None). For consistency, verify the test pattern aligns with the production code pattern.

🤖 Prompt for AI Agents
In backend/app/events/event_store.py around lines 71-72 the two-step await
pattern (await list_indexes() then await to_list()) is used; ensure
test_schema_manager.py uses the same pattern for consistency by replacing the
single-await call to list_indexes().to_list(...) with a two-step await: first
await the list_indexes() cursor, then await its to_list(None) result so the test
matches production code behavior.

Comment on lines +315 to 327
         client = AsyncMongoClient(
             "mongodb://root:rootpassword@localhost:27017",
             serverSelectionTimeoutMS=5000
         )
         await client.admin.command("ping")
-        client.close()
+        await client.close()
     except Exception:
         print("Starting MongoDB...")
         subprocess.run(["docker-compose", "up", "-d", "mongo"], check=False)
         await wait_for_service(
-            lambda: AsyncIOMotorClient("mongodb://root:rootpassword@localhost:27017").admin.command("ping"),
+            lambda: AsyncMongoClient("mongodb://root:rootpassword@localhost:27017").admin.command("ping"),
             service_name="MongoDB"
         )

⚠️ Potential issue | 🟠 Major

Verify async operations in service readiness checks.

Line 325 creates a lambda that instantiates AsyncMongoClient and calls admin.command("ping"), but the lambda doesn't use await. The wait_for_service helper (line 299) awaits check_func(), but the lambda itself needs to be async and properly await the ping command.

🔎 Proposed fix
         await wait_for_service(
-            lambda: AsyncMongoClient("mongodb://root:rootpassword@localhost:27017").admin.command("ping"),
+            lambda: _check_mongo_async("mongodb://root:rootpassword@localhost:27017"),
             service_name="MongoDB"
         )
+
+async def _check_mongo_async(url: str):
+    """Helper to check MongoDB connectivity."""
+    client = AsyncMongoClient(url, serverSelectionTimeoutMS=5000)
+    try:
+        await client.admin.command("ping")
+    finally:
+        await client.close()

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In backend/tests/fixtures/real_services.py around lines 315 to 327, the lambda
passed to wait_for_service is synchronous and calls
AsyncMongoClient.admin.command("ping") without awaiting; change the check
function to an async def (or an async lambda equivalent) that instantiates
AsyncMongoClient, awaits client.admin.command("ping"), closes the client with
await client.close() (or use an async context manager if available), and pass
that coroutine function to wait_for_service so the helper can await it
correctly.

@sonarqubecloud

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/tests/fixtures/real_services.py (1)

337-340: Resource leak: Redis client created but never closed in retry lambda.

Similar to the MongoDB lambda above, this creates a new redis.Redis instance on each retry without closing it, causing a resource leak.

🔎 Proposed fix

Create a helper function that properly closes the Redis client:

+async def _check_redis_async(host: str, port: int):
+    """Helper to check Redis connectivity."""
+    client = redis.Redis(host=host, port=port, socket_connect_timeout=5)
+    try:
+        await client.execute_command("PING")
+    finally:
+        await client.aclose()
+
     # Check Redis
     try:
         r = redis.Redis(host="localhost", port=6379, socket_connect_timeout=5)
         await r.execute_command("PING")
         await r.aclose()
     except Exception:
         print("Starting Redis...")
         subprocess.run(["docker-compose", "up", "-d", "redis"], check=False)
         await wait_for_service(
-            lambda: redis.Redis(host="localhost", port=6379).execute_command("PING"),
+            lambda: _check_redis_async("localhost", 6379),
             service_name="Redis"
         )
♻️ Duplicate comments (1)
backend/tests/fixtures/real_services.py (1)

324-327: Resource leak: AsyncMongoClient created but never closed in retry lambda.

The lambda creates a new AsyncMongoClient instance on each retry but never closes it. While the wait loop is in a session-scoped fixture (limited impact), it's still a resource leak.

🔎 Proposed fix

Create a helper function that properly closes the client:

+async def _check_mongo_async(url: str):
+    """Helper to check MongoDB connectivity."""
+    client = AsyncMongoClient(url, serverSelectionTimeoutMS=5000)
+    try:
+        await client.admin.command("ping")
+    finally:
+        await client.close()
+
     # Check MongoDB
     try:
         client = AsyncMongoClient(
             "mongodb://root:rootpassword@localhost:27017",
             serverSelectionTimeoutMS=5000
         )
         await client.admin.command("ping")
         await client.close()
     except Exception:
         print("Starting MongoDB...")
         subprocess.run(["docker-compose", "up", "-d", "mongo"], check=False)
         await wait_for_service(
-            lambda: AsyncMongoClient("mongodb://root:rootpassword@localhost:27017").admin.command("ping"),
+            lambda: _check_mongo_async("mongodb://root:rootpassword@localhost:27017"),
             service_name="MongoDB"
         )
🧹 Nitpick comments (2)
.github/workflows/backend-ci.yml (2)

73-112: Consider extracting Docker caching logic to reduce duplication.

This Docker image caching block (lines 73-112) is duplicated in the e2e job (lines 185-224). Since both jobs use identical caching logic, consider extracting it into a composite action or reusable workflow to improve maintainability.

💡 Example: composite action structure

Create .github/actions/cache-docker-images/action.yml:

name: Cache Docker Images
description: Cache and load Docker images for CI
runs:
  using: composite
  steps:
    - name: Cache Docker images
      uses: actions/cache@v5
      id: docker-cache
      with:
        path: /tmp/docker-cache
        key: docker-${{ runner.os }}-${{ env.MONGO_IMAGE }}-${{ env.REDIS_IMAGE }}-${{ env.KAFKA_IMAGE }}-${{ env.SCHEMA_REGISTRY_IMAGE }}
    
    - name: Load cached Docker images
      if: steps.docker-cache.outputs.cache-hit == 'true'
      shell: bash
      run: |
        echo "Loading cached images..."
        for f in /tmp/docker-cache/*.tar.zst; do
          zstd -d -c "$f" | docker load &
        done
        wait
        docker images
    
    - name: Pull and save Docker images
      if: steps.docker-cache.outputs.cache-hit != 'true'
      shell: bash
      run: |
        mkdir -p /tmp/docker-cache
        echo "Pulling images in parallel..."
        docker pull $MONGO_IMAGE &
        docker pull $REDIS_IMAGE &
        docker pull $KAFKA_IMAGE &
        docker pull $SCHEMA_REGISTRY_IMAGE &
        wait
        echo "Saving images with zstd compression..."
        docker save $MONGO_IMAGE | zstd -T0 -3 > /tmp/docker-cache/mongo.tar.zst &
        docker save $REDIS_IMAGE | zstd -T0 -3 > /tmp/docker-cache/redis.tar.zst &
        docker save $KAFKA_IMAGE | zstd -T0 -3 > /tmp/docker-cache/kafka.tar.zst &
        docker save $SCHEMA_REGISTRY_IMAGE | zstd -T0 -3 > /tmp/docker-cache/schema-registry.tar.zst &
        wait
        echo "Cache size:"
        du -sh /tmp/docker-cache/

Then use it in both jobs:

- uses: ./.github/actions/cache-docker-images

243-251: Consider pinning the k3s version for consistency.

The k3s installation uses the latest version from get.k3s.io, which could lead to unexpected behavior changes or flakiness if k3s releases a breaking update. Consider pinning to a specific version for consistency with the Docker image version pinning strategy.

🔎 Example: pin k3s version
-curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable=traefik" sh -
+curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.28.5+k3s1" INSTALL_K3S_EXEC="--disable=traefik" sh -

To find the latest stable version:

curl -s https://api.github.com/repos/k3s-io/k3s/releases/latest | jq -r .tag_name
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5f43ef and 5271c64.

📒 Files selected for processing (4)
  • .github/workflows/backend-ci.yml
  • backend/scripts/seed_users.py
  • backend/tests/fixtures/real_services.py
  • backend/workers/run_saga_orchestrator.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/scripts/seed_users.py
🧰 Additional context used
🧬 Code graph analysis (2)
backend/workers/run_saga_orchestrator.py (1)
backend/app/services/event_bus.py (1)
  • close (349-354)
backend/tests/fixtures/real_services.py (2)
backend/tests/conftest.py (4)
  • app (130-138)
  • redis_client (184-186)
  • db (178-180)
  • client (150-161)
backend/app/core/database_context.py (6)
  • db_name (72-74)
  • db_name (152-153)
  • db_name (205-206)
  • client (62-64)
  • client (140-143)
  • client (197-198)
🪛 Checkov (3.2.334)
.github/workflows/backend-ci.yml

[medium] 138-139: Basic Auth Credentials

(CKV_SECRET_4)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Scan Backend
  • GitHub Check: Scan Frontend
  • GitHub Check: E2E Tests
  • GitHub Check: E2E Tests
  • GitHub Check: Integration Tests
🔇 Additional comments (14)
backend/workers/run_saga_orchestrator.py (2)

114-114: Critical: Properly await async close.

Adding await to db_client.close() is essential for PyMongo's async client. Without it, the database connection may not close properly, leading to resource leaks. This change correctly aligns with async cleanup patterns used elsewhere in the finally block.


20-20: The migration from Motor's AsyncIOMotorClient to PyMongo's native AsyncMongoClient is properly implemented. PyMongo AsyncMongoClient supports both tz_aware and serverSelectionTimeoutMS constructor arguments, and the usage patterns in this file (subscript access for database selection, admin.command() calls, and close() cleanup with await) are all compatible.

backend/tests/fixtures/real_services.py (7)

15-17: LGTM! Import changes align with PyMongo migration.

The imports correctly reference the new PyMongo async client and the Database/DBClient abstractions from the updated database context.


26-26: Type annotation correctly updated to DBClient.

The change from AsyncIOMotorClient to DBClient aligns with the new database abstractions.


32-42: Method correctly migrated to PyMongo async API.

The signature returns Database, uses AsyncMongoClient with appropriate timeouts and connection limits, and properly awaits the ping command.


96-102: Cleanup correctly handles async client closure.

The addition of await self.mongo_client.close() on line 101 properly releases MongoDB client resources.


133-139: Fixture return type correctly updated.

The real_mongodb fixture now returns Database, consistent with the updated connect_mongodb method signature.


161-175: Transaction context manager correctly uses PyMongo async session API.

The parameter type is updated to Database, and the pattern async with await client.start_session() correctly handles PyMongo's async session lifecycle.
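
For readers less familiar with the PyMongo async session API, a minimal sketch of the cited pattern follows (transaction wrapping omitted; the helper and collection names are illustrative, not taken from the repository):

from contextlib import asynccontextmanager

from pymongo import AsyncMongoClient


@asynccontextmanager
async def mongo_session(client: AsyncMongoClient):
    # start_session() is a coroutine in PyMongo's async API, so it is awaited
    # first; the AsyncClientSession it returns is then entered as an async
    # context manager, which ends the session on exit.
    async with await client.start_session() as session:
        yield session


async def insert_with_session(client: AsyncMongoClient, doc: dict) -> None:
    # "example_db" / "items" are illustrative names.
    async with mongo_session(client) as session:
        await client["example_db"]["items"].insert_one(doc, session=session)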


194-250: TestDataFactory correctly migrated to Database abstraction.

All methods now accept Database parameters and use the correct PyMongo async collection access pattern (e.g., db.users.insert_one).
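
As an illustration of that collection-access pattern, here is a hypothetical factory helper (the field names are made up, and the pymongo.asynchronous.database import path assumes PyMongo ≥ 4.9; nothing here is copied from the project's TestDataFactory):

import uuid
from typing import Any

from pymongo.asynchronous.database import AsyncDatabase


async def create_user(db: AsyncDatabase, username: str, **overrides: Any) -> dict[str, Any]:
    # Attribute access (db.users) resolves to an AsyncCollection, and
    # insert_one is awaited, mirroring the factory methods reviewed above.
    doc: dict[str, Any] = {
        "user_id": str(uuid.uuid4()),
        "username": username,
        "is_active": True,
        **overrides,  # illustrative fields, not the project's schema
    }
    await db.users.insert_one(doc)
    return doc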

.github/workflows/backend-ci.yml (5)

18-23: LGTM! Image version pinning ensures cache consistency.

Pinning the Docker image versions at the workflow level ensures that the cache keys remain stable across runs, which directly supports the CI speed-up objectives.


45-63: PR objective achieved: COVERAGE_CORE set to sysmon.

The unit test configuration correctly implements the PR objective by setting COVERAGE_CORE: sysmon on line 48, which switches coverage.py to the lower-overhead sys.monitoring measurement core available on Python 3.12+ and should improve coverage performance. The test isolation (no infrastructure dependencies) and 5-minute timeout are appropriate for unit tests.


126-129: PR objective achieved: curl polling replaced with --wait flag.

Line 128 correctly uses docker compose up -d --wait --wait-timeout 120 to replace manual curl polling, which should improve CI reliability and speed. The 120-second timeout is reasonable for the infrastructure services (MongoDB, Redis, Kafka, Schema Registry).


131-150: LGTM! Integration test configuration is correct.

The test execution correctly:

  • Excludes k8s tests with --ignore=tests/integration/k8s for this non-k8s job
  • Sets COVERAGE_CORE: sysmon (PR objective)
  • Provides appropriate environment variables for all services
  • Uses a reasonable 10-minute timeout

253-282: LGTM! E2E test execution properly configured.

The E2E test execution correctly:

  • Targets only k8s tests (tests/integration/k8s)
  • Sets COVERAGE_CORE: sysmon consistently with other jobs (PR objective)
  • Provides k8s-specific environment variables (KUBECONFIG, K8S_NAMESPACE)
  • Uses distinct coverage flag backend-e2e for proper Codecov segmentation

Comment on lines +81 to +89
- name: Load cached Docker images
if: steps.docker-cache.outputs.cache-hit == 'true'
run: |
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable=traefik --tls-san host.docker.internal" sh -
mkdir -p /home/runner/.kube
sudo k3s kubectl config view --raw > /home/runner/.kube/config
sudo chmod 600 /home/runner/.kube/config
export KUBECONFIG=/home/runner/.kube/config
timeout 90 bash -c 'until sudo k3s kubectl cluster-info; do sleep 5; done'
echo "Loading cached images..."
for f in /tmp/docker-cache/*.tar.zst; do
zstd -d -c "$f" | docker load &
done
wait
docker images

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Test parallel operation error handling

# Test 1: Verify set -e is active
echo "Checking if errexit is enabled..."
if [[ $- == *e* ]]; then
  echo "✓ errexit is enabled (set -e is active)"
else
  echo "✗ Warning: errexit is not enabled"
fi

# Test 2: Simulate a failing parallel operation
echo -e "\nTesting wait with failing background jobs..."
(
  set -e
  false &
  true &
  wait
) && echo "✗ Failed: wait did not propagate error" || echo "✓ Passed: wait correctly propagated error"

# Test 3: Test wait behavior with multiple failures
echo -e "\nTesting wait with multiple failing jobs..."
(
  set -e
  false &
  false &
  wait
) && echo "✗ Failed: wait did not propagate error" || echo "✓ Passed: wait correctly propagated error"

Repository: HardMax71/Integr8sCode

Length of output: 301


🏁 Script executed:

cat -n .github/workflows/backend-ci.yml | head -120

Repository: HardMax71/Integr8sCode

Length of output: 4317


🌐 Web query:

GitHub Actions default shell behavior set -e errexit

💡 Result:

Short answer: GitHub Actions run-steps use "exit on error" by default (the shell is invoked with -e). Specifying shell: bash adds -o pipefail as well; you can disable the default behavior inside a step with set +e or override the shell. [1][2][3]

How to control it (examples):

  • disable exit-on-error for a step:
    run: |
      set +e
      cmd-that-may-fail
  • force bash with pipefail (ensures pipelines fail early):
    defaults.run.shell: bash
    or per-step:
    shell: bash
    run: |
      ...commands...
  • explicitly set and preserve exit code if needed:
    run: |
      set +e
      cmd
      rc=$?
      echo "rc=$rc" >> $GITHUB_OUTPUT
      exit $rc

References:
[1] StackOverflow: "Does a GitHub action step use set -e semantics by default?" (explains -e default and pipefail behavior).
[2] GitHub Community discussion (confirming -e default and using set +e).
[3] GitHub Docs — defaults.run / setting default shell.


Add explicit error handling for parallel Docker operations.

A bare wait waits for all background jobs but always returns 0, regardless of whether any of them failed. If any docker load/pull/save operation fails, the failure is masked and the step won't fail even with errexit enabled. This is confirmed by bash behavior: false & true & wait returns 0 (success) despite the first command failing.

For lines 81-89, track each background job's PID and check its status individually (pipefail inside the subshell also surfaces zstd failures that would otherwise be hidden by the pipe):

pids=()
for f in /tmp/docker-cache/*.tar.zst; do
  (set -o pipefail; zstd -d -c "$f" | docker load) &
  pids+=($!)
done
for pid in "${pids[@]}"; do
  wait "$pid" || exit 1
done

Similarly for lines 91-108, apply the same PID tracking and per-job wait to both parallel blocks.

Also applies to: 91-108

🤖 Prompt for AI Agents
.github/workflows/backend-ci.yml lines 81-89 and 91-108: the workflow launches
multiple background pipelines (zstd | docker load / docker save / docker pull)
but only uses a bare wait which can mask failures; enable strict failure
detection by turning on pipefail for pipelines and tracking background job PIDs,
then wait on each PID individually and check each wait's exit code, exiting the
step non‑zero if any job failed (apply the same PID tracking + per‑job
wait+check to both blocks).
