LCORE-332: Lightspeed core needs to fully support WatsonX LLM provider #943

are-ces · 2025-12-22T09:19:34Z

Description

Added e2e tests for WatsonX provider
Updated documentation
updated e2e tests for the issue mentioned below
updated query.py for the issue mentioned below

When calling query without explicitly specifying a model or provider, query.py in LCS checks whether the model is registered. In this case, LCS fails to find the WatsonX model. This happens because WatsonX model identifiers are not built in the expected <provider_id>/<model_id> format. WatsonX is one of the few providers that registers a model_id containing a /, for example: meta-llama/llama-4-maverick-17b-128e-instruct-fp8, which in llama-stack is stored as the model.identifier, while LCS expects it to be watsonx/meta-llama/llama-4-maverick-17b-128e-instruct-fp8.

As a workaround, I propose using environment variables to override the model and provider used in end-to-end tests (E2E_DEFAULT_MODEL_OVERRIDE and E2E_DEFAULT_PROVIDER_OVERRIDE). Additionally, the e2e tests should explicitly specify the model and provider instead of sending only the query field.

We can keep the existing tests in query.feature and streaming.feature that verify calling the endpoint without specifying a provider or model. These tests will continue to fail for WatsonX until the issue is fixed upstream.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)
NA

Related Tickets & Documents

Related Issue # LCORE-332
Closes # LCORE-332

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

New Features
- Added WatsonX as a supported provider for AI inference capabilities.
- Introduced WatsonX environment configuration and example runtime setup.
Documentation
- Updated provider documentation and README to include WatsonX support.
Tests
- Extended E2E test matrix to include WatsonX environment testing.
- Updated test scenarios to support explicit model and provider parameters.
- Added environment variable overrides for test configuration flexibility.
Improvements
- Enhanced model matching logic for better identifier handling.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

added override model and provider to e2e tests patched conversation tests to include model and provider in the call

coderabbitai · 2025-12-22T09:19:44Z

Walkthrough

This PR adds comprehensive WatsonX provider support to the Llama Stack project, including CI/CD workflow configuration with watsonx environment matrix entry, docker-compose environment variable additions, new example and E2E test configurations, documentation updates, and E2E test enhancements to support explicit model/provider parameters with environment override capability.

Changes

Cohort / File(s)	Summary
CI/CD & Workflow `.github/workflows/e2e_tests.yaml`	Adds watsonx as an additional environment to the E2E test matrix with WatsonX credentials (WATSONX_PROJECT_ID, WATSONX_API_KEY) and introduces a test override step for watsonx environment to set default model/provider values.
Container & Infrastructure `docker-compose.yaml`, `docker-compose-library.yaml`	Adds three WatsonX-related environment variables (WATSONX_BASE_URL, WATSONX_PROJECT_ID, WATSONX_API_KEY) to both services using default-empty pattern consistent with other providers.
Documentation `README.md`, `docs/providers.md`	Adds WatsonX provider entry to prerequisites/provider documentation and updates the Inference Providers table with watsonx mapped to litellm implementation.
Configuration & Examples `examples/watsonx-run.yaml`, `tests/e2e/configs/run-watsonx.yaml`	Introduces new full Llama runtime configuration files with watsonx provider setup, including API groups, storage backends, inference/metadata stores, vector stores, and tool definitions.
Core Logic `src/app/endpoints/query.py`	Consolidates ToolExecutionStep import and broadens model matching logic to accept either full llama_stack_model_id or plain model_id from available models.
E2E Test Environment `tests/e2e/features/environment.py`	Adds support for environment variable overrides (E2E_DEFAULT_MODEL_OVERRIDE, E2E_DEFAULT_PROVIDER_OVERRIDE) to set default LLM model/provider before fetching from service, with fallback to development defaults.
E2E Test Features `tests/e2e/features/conversations.feature`, `tests/e2e/features/query.feature`, `tests/e2e/features/streaming_query.feature`	Extends all relevant test JSON payloads to include "model" and "provider" fields alongside existing query parameters across multiple scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Logic changes in query.py: Model matching logic broadened to accept dual identifiers—verify this maintains backward compatibility and doesn't introduce unintended matches.
Environment override mechanism in environment.py: New conditional logic for override precedence (overrides → service fetch → fallback)—ensure precedence order and fallback behavior align with intended behavior.
Test payload consistency: Verify all E2E test scenarios consistently apply model/provider fields across conversations, query, and streaming_query features.
WatsonX configuration: Ensure watsonx configuration in docker-compose files and E2E configs is complete, consistent, and properly uses environment variable placeholders.

Possibly related PRs

LCORE-166: add e2e tests for the endpoints #425: Modifies the same E2E feature files (conversations.feature, query.feature, streaming_query.feature) with model/provider field additions.
LCORE-760: fix hardcoded model and provider in e2e tests #680: Updates E2E test environment handling and feature files to support dynamic model/provider overrides and placeholder substitution.
LCORE-792: Lightspeed core needs to fully support VertexAI LLM provider #924: Adds support for a new LLM provider with parallel CI workflow and docker-compose configuration changes.

Suggested reviewers

tisnik
radofuchs

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and clearly identifies the main objective: adding full WatsonX LLM provider support to Lightspeed core, which aligns with the core changes throughout the PR.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

are-ces · 2025-12-22T09:22:24Z

PS: Please ignore issues with VertexAI provider as they are independent of these changes; the project has been migrated by the cloud team, I will need to talk to them to re-enable it.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (6)

.github/workflows/e2e_tests.yaml (1)
263-268: Document the watsonx override as temporary technical debt.

The override E2E_DEFAULT_MODEL_OVERRIDE=watsonx/watsonx/meta-llama/llama-3-3-70b-instruct duplicates the provider name, which is a workaround for WatsonX's model identifier format containing /. Per the PR description, this is intentional until an upstream fix is applied.

Consider tracking this as technical debt with a TODO comment or issue reference to ensure it's revisited when the upstream fix is available.
🔎 Suggested improvement
 # watsonx has a different convention than "<provider>/<model>"
+# TODO(LCORE-XXX): Remove this workaround once Llama Stack properly handles WatsonX model identifiers with embedded slashes
 - name: Set watsonx test overrides
   if: matrix.environment == 'watsonx'
   run: |
     echo "E2E_DEFAULT_MODEL_OVERRIDE=watsonx/watsonx/meta-llama/llama-3-3-70b-instruct" >> $GITHUB_ENV
     echo "E2E_DEFAULT_PROVIDER_OVERRIDE=watsonx" >> $GITHUB_ENV
tests/e2e/features/environment.py (1)
61-86: Consider validating partial override configuration.

The current logic requires both E2E_DEFAULT_MODEL_OVERRIDE and E2E_DEFAULT_PROVIDER_OVERRIDE to be set (line 65). If only one is provided, both are silently ignored and the code falls back to service detection. This could be confusing for users who expect a partial override to work.

Consider adding a warning when only one override is set, or raising an error if they should be used together.
🔎 Proposed validation for partial overrides
 # Check for environment variable overrides first
 model_override = os.getenv("E2E_DEFAULT_MODEL_OVERRIDE")
 provider_override = os.getenv("E2E_DEFAULT_PROVIDER_OVERRIDE")
 
+# Validate that overrides are provided together
+if bool(model_override) != bool(provider_override):
+    print(
+        "⚠ Warning: Both E2E_DEFAULT_MODEL_OVERRIDE and E2E_DEFAULT_PROVIDER_OVERRIDE "
+        "must be set together. Falling back to service detection."
+    )
+
 if model_override and provider_override:
     context.default_model = model_override
     context.default_provider = provider_override
     print(
         f"Using override LLM: {context.default_model} (provider: {context.default_provider})"
     )
examples/watsonx-run.yaml (2)

65-65: Clarify the hardcoded asterisks for openai_api_key.

The openai_api_key is set to a literal string of asterisks '********'. If the braintrust scoring provider is intended to be functional, this will fail authentication. If braintrust is disabled or not used in this example, consider removing this provider entry or adding a comment explaining it's a placeholder.

50-50: Track the workaround for disabled safety shields.

Safety shields are disabled with a warning comment about infinite loop issues with LLM calls (lines 50 and 149). This is a significant security/safety feature being disabled.

Consider creating a tracking issue for this known limitation so it can be re-enabled once the upstream issue is resolved.

Would you like me to help create a GitHub issue to track this limitation?

Also applies to: 149-149

tests/e2e/configs/run-watsonx.yaml (2)

65-65: Clarify the hardcoded asterisks for openai_api_key.

The openai_api_key is set to a literal string of asterisks '********'. If the braintrust scoring provider is needed for E2E tests, this will fail authentication. If braintrust is not used in these tests, consider removing this provider entry or documenting that it's intentionally disabled.

50-50: Document the workaround for disabled safety shields in E2E tests.

Safety shields are disabled with a warning comment about infinite loop issues with LLM calls (lines 50 and 149). This affects the test coverage for safety features.

Consider adding a comment or documentation about which safety-related test scenarios are skipped due to this limitation.

Also applies to: 149-149

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd795b3 and 9dbf66b.

📒 Files selected for processing (12)

.github/workflows/e2e_tests.yaml
README.md
docker-compose-library.yaml
docker-compose.yaml
docs/providers.md
examples/watsonx-run.yaml
src/app/endpoints/query.py
tests/e2e/configs/run-watsonx.yaml
tests/e2e/features/conversations.feature
tests/e2e/features/environment.py
tests/e2e/features/query.feature
tests/e2e/features/streaming_query.feature

🧰 Additional context used

📓 Path-based instructions (5)

tests/e2e/features/**/*.feature