LCORE-603: Lightspeed core needs to fully support Faiss VectorIO provider #935
Conversation
- FAISS e2e test (for all inference providers)
- Disable shields in all configurations
- Dummy rag db
Walkthrough

Restructures many example and tests YAML configs to consolidate provider and storage configurations into unified, env-variable-backed providers/backends; adds new vLLM/RHEL/RHOAI examples and an E2E FAISS test; removes several legacy example manifests; adds a RAG volume mount to docker-compose files and introduces a new dependency.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 7
🧹 Nitpick comments (6)
examples/vllm-rhaiis-run.yaml (1)

33-37: Hardcoded port 8000 in URL.

The URL hardcodes port 8000 (`http://${env.RHAIIS_URL}:8000/v1/`), unlike the RHEL AI config, which uses `${env.RHEL_AI_PORT}`. Consider using an environment variable for the port to maintain deployment flexibility: `url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/`
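A minimal sketch of the suggested entry, assuming a `remote::vllm` provider type and the hypothetical `RHAIIS_PORT` variable; `${env.VAR:=default}` is the fallback syntax these configs already use elsewhere:

```yaml
inference:
- provider_id: vllm
  provider_type: remote::vllm  # assumed provider type for the vLLM examples
  config:
    # Falls back to port 8000 when RHAIIS_PORT (hypothetical name) is unset
    url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/
```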
tests/e2e/features/faiss.feature (1)

8-27: Consider adding a note about the port difference.

This scenario correctly uses `@skip-in-library-mode` since it directly queries the llama-stack service on port 8321. The second scenario (lines 29-39) omits the hostname/port override, so it will use defaults to test through lightspeed-stack. This asymmetry is intentional but worth a brief inline comment for maintainability.

tests/e2e/configs/run-azure.yaml (1)
34-36: Consider parameterizing the Azure API base URL.

The `api_base` is hardcoded to a specific test endpoint. For flexibility across environments, consider using an environment variable with a default: `api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}`

examples/azure-run.yaml (2)
34-36: Consider using an environment variable for the API base URL.

The `api_base` is hardcoded while `api_key` uses an environment variable. For consistency and deployment flexibility across environments, consider parameterizing the base URL:

```diff
 config:
   api_key: ${env.AZURE_API_KEY}
-  api_base: https://ols-test.openai.azure.com/
+  api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
   api_version: 2024-02-15-preview
```
111-116: Inconsistent storage path structure between backends.

`kv_default` uses `~/.llama/storage/rag/kv_store.db` (includes a `/rag/` subfolder) while `sql_default` uses `~/.llama/storage/sql_store.db` (no subfolder). This asymmetry may cause confusion. Consider aligning the path structure or documenting the rationale.
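A sketch of one possible alignment; the two `db_path` values are the aligned versions of the paths quoted above, while the surrounding key names are assumptions rather than the PR's actual schema:

```yaml
storage:
  backends:
    kv_default:
      # Both backends now share ~/.llama/storage/ with no kv-only /rag/ subfolder
      db_path: ~/.llama/storage/kv_store.db
    sql_default:
      db_path: ~/.llama/storage/sql_store.db
```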
run.yaml (1)

119-125: Same storage path inconsistency as azure-run.yaml.

The `kv_default` backend uses `~/.llama/storage/rag/kv_store.db` while `sql_default` uses `~/.llama/storage/sql_store.db`. The comment on line 120 explains that registry and RAG are unified in `kv_default`, which provides context. However, consider whether the `/rag/` subdirectory name is still appropriate given it stores more than just RAG data.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `tests/e2e/rag/kv_store.db` is excluded by `!**/*.db`
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (25)
- docker-compose-library.yaml (1 hunks)
- docker-compose.yaml (1 hunks)
- docs/providers.md (2 hunks)
- examples/azure-run.yaml (1 hunks)
- examples/gemini-run.yaml (0 hunks)
- examples/openai-faiss-run.yaml (0 hunks)
- examples/openai-pgvector-run.yaml (0 hunks)
- examples/run.yaml (1 hunks)
- examples/vertexai-run.yaml (4 hunks)
- examples/vllm-granite-run.yaml (0 hunks)
- examples/vllm-llama-faiss-run.yaml (0 hunks)
- examples/vllm-qwen3-run.yaml (0 hunks)
- examples/vllm-rhaiis-run.yaml (1 hunks)
- examples/vllm-rhelai-run.yaml (1 hunks)
- examples/vllm-rhoai-run.yaml (1 hunks)
- pyproject.toml (1 hunks)
- run.yaml (3 hunks)
- tests/e2e-prow/rhoai/configs/run.yaml (4 hunks)
- tests/e2e/configs/run-azure.yaml (4 hunks)
- tests/e2e/configs/run-ci.yaml (3 hunks)
- tests/e2e/configs/run-rhaiis.yaml (4 hunks)
- tests/e2e/configs/run-rhelai.yaml (4 hunks)
- tests/e2e/configs/run-vertexai.yaml (4 hunks)
- tests/e2e/features/faiss.feature (1 hunks)
- tests/e2e/test_list.txt (1 hunks)
💤 Files with no reviewable changes (6)
- examples/gemini-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
🧰 Additional context used
📓 Path-based instructions (3)
tests/e2e/test_list.txt
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain test list in `tests/e2e/test_list.txt` for end-to-end tests
Files:
tests/e2e/test_list.txt
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Files:
tests/e2e/features/faiss.feature
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
`pyproject.toml`: Configure pylint with `source-roots = "src"`
Exclude `src/auth/k8s.py` from pyright type checking
Files:
pyproject.toml
🧠 Learnings (3)
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt` for end-to-end tests
Applied to files:
tests/e2e/test_list.txt
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Applied to files:
tests/e2e/test_list.txt
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.
Applied to files:
tests/e2e/test_list.txt
🪛 YAMLlint (1.37.1)
tests/e2e/configs/run-rhaiis.yaml
[error] 94-94: duplication of key "config" in mapping
(key-duplicates)
[error] 98-98: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 99-99: duplication of key "provider_type" in mapping
(key-duplicates)
[error] 100-100: duplication of key "config" in mapping
(key-duplicates)
[error] 104-104: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 105-105: duplication of key "provider_type" in mapping
(key-duplicates)
tests/e2e/configs/run-rhelai.yaml
[error] 94-94: duplication of key "config" in mapping
(key-duplicates)
[error] 98-98: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 99-99: duplication of key "provider_type" in mapping
(key-duplicates)
[error] 100-100: duplication of key "config" in mapping
(key-duplicates)
[error] 104-104: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 105-105: duplication of key "provider_type" in mapping
(key-duplicates)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (30)
pyproject.toml (1)
55-55: Consider using a flexible version constraint for urllib3 or explain the exact version pin.

urllib3 2.6.1 is the latest non-vulnerable version, but 2.6.2 is available as a bug fix. The exact version pin (`==2.6.1`) is unnecessarily restrictive. If there's no specific compatibility reason for avoiding 2.6.2, consider using `>=2.6.1,<2.7` instead. If an exact pin is necessary, document the reason in a comment above the dependency.

The einops dependency is appropriate: einops 0.8.1 is the latest version with no known vulnerabilities.
⛔ Skipped due to learnings

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Always check `pyproject.toml` for existing dependencies and versions before adding new ones

docs/providers.md (2)
103-103: Documentation accurately reflects the disabled llama-guard shield. The change from ✅ to ❌ for llama-guard aligns with the configuration changes across the example files, where shields are disabled with the comment about the "infinite loop issue with LLM calls".
160-160: rag-runtime is now correctly marked as supported. The rag-runtime provider is properly documented as supported (✅), which aligns with its configuration in all the new example YAML files.
examples/vllm-rhelai-run.yaml (1)
1-151: New RHEL AI configuration file is well-structured. The configuration follows the consolidated provider pattern (see the sketch after this list) with:
- Environment-driven paths and credentials
- FAISS vector store with sentence-transformers embeddings
- Properly disabled shields with documented reasoning
- Consistent storage backend configuration
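A condensed sketch of that pattern; `${env.RHEL_AI_PORT}` and `${env.RHEL_AI_MODEL}` are quoted elsewhere in this review, while the other variable names and the `remote::vllm` provider type are assumptions:

```yaml
inference:
- provider_id: vllm
  provider_type: remote::vllm
  config:
    # Endpoint and credentials driven by environment variables (names assumed)
    url: http://${env.RHEL_AI_URL}:${env.RHEL_AI_PORT}/v1/
    api_token: ${env.RHEL_AI_API_KEY}
safety: []  # shields disabled: documented infinite-loop issue with LLM calls
```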
examples/vertexai-run.yaml (2)
30-67: Provider configuration is well-organized. The inference providers are properly configured with:
- Google Vertex AI remote provider with project/location from environment variables
- Sentence-transformers for embeddings
- FAISS vector store with persistence
132-141: Empty models array may need attention.

The `registered_resources.models` array is empty while the Vertex AI provider specifies `allowed_models: ["google/gemini-2.5-flash"]`. Verify if a model registration is needed here, or if the `allowed_models` configuration in the provider is sufficient for Vertex AI.
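If explicit registration turns out to be necessary, a hedged sketch; the model ID comes from `allowed_models` above, while `provider_id: vertexai` and the remaining keys are assumptions patterned on the vLLM examples in this PR:

```yaml
registered_resources:
  models:
  - model_id: google/gemini-2.5-flash
    provider_id: vertexai  # assumed provider id
    model_type: llm
```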
examples/vllm-rhoai-run.yaml (2)

134-138: Verify the hardcoded model configuration and null provider_model_id.

Unlike other example configs that use environment variables (e.g., `${env.RHEL_AI_MODEL}`), this file hardcodes `model_id: meta-llama/Llama-3.2-1B-Instruct` and sets `provider_model_id: null`.

Consider using environment variables for consistency:

```yaml
- model_id: ${env.RHOAI_MODEL}
  provider_id: vllm
  model_type: llm
  provider_model_id: ${env.RHOAI_MODEL}
```
37-37: Different max_tokens value than other vLLM configs.

This config uses `max_tokens: 1024` while `vllm-rhelai-run.yaml` and `vllm-rhaiis-run.yaml` use `max_tokens: 2048`. Verify if this is intentional for RHOAI deployments.

examples/vllm-rhaiis-run.yaml (1)
1-151: RHAIIS configuration follows the established pattern. The configuration is well-structured and consistent with other example configurations in this PR.
tests/e2e/configs/run-rhelai.yaml (1)
1-27: Configuration header and store definitions look correct. The version, image_name, APIs, and store configurations follow the established pattern with proper environment variable substitution.
tests/e2e-prow/rhoai/configs/run.yaml (1)
1-150: Configuration structure looks well-organized. The consolidated provider configuration with environment-variable-backed paths is consistent. The explicit warnings about disabled shields (lines 49, 138) document the known issue clearly.
tests/e2e/configs/run-vertexai.yaml (2)
30-67: Provider configuration looks correct. The Vertex AI inference provider is properly configured with project/location environment variables, and the FAISS vector_io provider follows the correct structure with persistence settings.
132-146: Registered resources and vector stores configuration is consistent.

The empty `models` list is appropriate here since `allowed_models` is specified in the provider configuration (line 36). The `vector_stores` defaults align with the FAISS provider setup.

tests/e2e/configs/run-ci.yaml (2)
53-74: Scoring and vector_io providers are well-configured.

The scoring providers include useful documentation comments (e.g., "Enable the RAG tool", "Define the storage backend for RAG"). The placeholder `'********'` for `openai_api_key` is appropriate for example configurations.
114-155: Telemetry and storage configuration additions look good.

The inline documentation (lines 120 and 153) explaining the storage backend purpose improves maintainability. The `vector_stores` configuration with FAISS defaults is consistent with the PR's objective.

examples/run.yaml (3)
1-6: Good documentation header. The introductory comments clearly explain the configuration purpose and prerequisites (OpenAI API key, rag-content tool reference). This helps users understand how to use this example.
74-80: FAISS vector_io configuration is correct.

The persistence configuration with `namespace: vector_io::faiss` and `backend: kv_default` properly integrates with the storage backends defined below. This aligns with the PR's goal of fully supporting the FAISS VectorIO provider.
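For reference, a sketch of the provider entry this describes; the persistence keys are quoted from the config, while `inline::faiss` as the provider type is an assumption based on common llama-stack naming:

```yaml
vector_io:
- provider_id: faiss
  provider_type: inline::faiss  # assumed provider type
  config:
    persistence:
      # RAG vectors live in the shared KV backend under their own namespace
      namespace: vector_io::faiss
      backend: kv_default
```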
147-161: Registered resources and vector stores complete the FAISS integration.

The `vector_stores` section properly configures:

- `default_provider_id: faiss`, matching the vector_io provider
- `default_embedding_model` using sentence-transformers with the nomic model

This provides a complete working example for FAISS-based RAG.
tests/e2e/test_list.txt (1)
1-1: LGTM!

The new `faiss.feature` is correctly added to the test list, following the coding guidelines for maintaining end-to-end tests in this file.

docker-compose-library.yaml (1)
16-16: LGTM! The volume mount correctly exposes the test RAG database to the container, enabling FAISS-based e2e tests.
docker-compose.yaml (1)
14-14: LGTM!

Consistent with the corresponding mount in `docker-compose-library.yaml`, ensuring FAISS e2e tests work in both library and non-library modes.

tests/e2e/features/faiss.feature (1)
1-39: Feature file follows BDD conventions. The Gherkin structure is correct and scenarios appropriately test FAISS vector store registration and retrieval. As per coding guidelines, this uses the behave framework with proper Given/When/Then steps.
tests/e2e/configs/run-azure.yaml (2)
48-52: Safety providers disabled with documented reason. The commented-out safety configuration includes a clear warning about the infinite loop issue. This is good practice for documenting why a feature is disabled.
61-67: FAISS vector_io provider correctly configured.

The FAISS provider configuration with persistence to the `kv_default` backend aligns with the storage configuration and the mounted RAG volume path.

examples/azure-run.yaml (3)
4-14: Missing `telemetry` API compared to run.yaml.

The run.yaml includes `telemetry` in the APIs list and has a `telemetry: enabled: true` section, but this Azure configuration omits both. If telemetry should be available in Azure deployments, add it for consistency, as sketched below.
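A sketch of the two additions this would involve, mirroring run.yaml as described above; the other entries in the `apis` list are elided:

```yaml
apis:
# ...existing APIs...
- telemetry
telemetry:
  enabled: true
```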
48-52: Shields disabled with documented warning — acceptable for known issue.

The safety providers array is empty with a clear warning comment about the infinite loop issue. The corresponding `shields: []` in registered_resources is consistent. This is appropriately documented.

Also applies to: 138-138
132-150: Registered resources and vector_stores configuration looks correct.

The configuration properly:

- Registers the `gpt-4o-mini` model with the Azure provider
- Sets up the RAG toolgroup with `rag-runtime`
- Configures FAISS as the default vector store provider with `nomic-embed-text-v1.5` as the embedding model

run.yaml (3)
31-39: Well-structured inference providers with environment-driven configuration. The OpenAI provider correctly uses environment variables for both the API key and allowed models with sensible defaults. The inline sentence-transformers provider is appropriately configured for embedding support.
68-74: FAISS vector_io provider correctly configured for persistence.

The FAISS provider is properly set up with a persistence backend pointing to `kv_default`, which aligns with the storage configuration. This enables the RAG functionality as intended by the PR.
148-155: RAG toolgroup and vector_stores configuration complete.

The configuration properly:

- Registers the `builtin::rag` toolgroup with the `rag-runtime` provider
- Sets FAISS as the default vector store provider
- Configures `nomic-ai/nomic-embed-text-v1.5` as the default embedding model via `sentence-transformers`

This completes the FAISS VectorIO provider setup required by the PR objectives (see the sketch below).
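A condensed sketch combining the pieces listed above; the `toolgroups` key names are assumptions, while the `vector_stores` block matches the one quoted later in this review for tests/e2e/configs/run-azure.yaml:

```yaml
registered_resources:
  toolgroups:
  - toolgroup_id: builtin::rag
    provider_id: rag-runtime
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5
```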
```yaml
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: '********'
```
Placeholder API key should use an environment variable.
The braintrust provider has a hardcoded placeholder '********' for openai_api_key. This will fail at runtime if braintrust scoring is used. Either use an environment variable pattern consistent with other providers, or document that this provider requires manual configuration.
```diff
 - provider_id: braintrust
   provider_type: inline::braintrust
   config:
-    openai_api_key: '********'
+    openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
```
🤖 Prompt for AI Agents
In run.yaml around lines 60 to 63, the braintrust provider currently uses a
hardcoded placeholder openai_api_key '********' which will fail at runtime;
replace that value with an environment variable reference (follow the pattern
used by other providers in this repo — e.g. use a typed env var like
${BRAINTRUST_OPENAI_API_KEY} or ${OPENAI_API_KEY}) and update any README/config
docs to state this provider requires that environment variable to be set.
```yaml
telemetry:
  enabled: true
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Find run.yaml file
fd -n "run.yaml" --type f
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 306

🏁 Script executed:

```bash
#!/bin/bash
# Find run.yaml and check its structure
fd -type f -name "*.yaml" | head -20
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 246

🏁 Script executed:

```bash
find . -name "run.yaml" -type f 2>/dev/null | head -5
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 177

🏁 Script executed:

```bash
#!/bin/bash
# Search for run.yaml files
find . -type f -name "run.yaml" 2>/dev/null
# Also search for telemetry configuration patterns
rg -l "telemetry:" --type yaml 2>/dev/null | head -10
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 340

🏁 Script executed:

```bash
#!/bin/bash
# Locate run.yaml and examine structure around telemetry config
if [ -f "./run.yaml" ]; then
  echo "=== Found run.yaml ==="
  wc -l ./run.yaml
  echo "=== Lines 100-130 ==="
  sed -n '100,130p' ./run.yaml
  echo "=== Searching for providers section ==="
  grep -n "providers:" ./run.yaml || echo "No explicit providers section found"
elif [ -f "run.yaml" ]; then
  wc -l run.yaml
  sed -n '100,130p' run.yaml
  grep -n "providers:" run.yaml || echo "No explicit providers section found"
else
  find . -name "run.yaml" -type f 2>/dev/null -exec sh -c 'echo "=== Found: $1 ==="; wc -l "$1"; sed -n "100,130p" "$1"; grep -n "providers:" "$1" || echo "No providers section"' _ {} \;
fi
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 1099

🏁 Script executed:

```bash
#!/bin/bash
# Read the providers section
sed -n '1,80p' ./run.yaml
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 2155

🏁 Script executed:

```bash
#!/bin/bash
# Check entire telemetry section and nearby context
sed -n '110,125p' ./run.yaml
# Search for any backend/exporter configurations
grep -i "backend\|exporter\|endpoint" ./run.yaml | head -20
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 1284

🏁 Script executed:

```bash
#!/bin/bash
# Check if telemetry is listed in apis and look for telemetry-related documentation
sed -n '1,20p' ./run.yaml
# Search for any telemetry configuration or documentation
rg -i "telemetry" --type yaml -B2 -A2 ./run.yaml | head -40
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 504

🏁 Script executed:

```bash
#!/bin/bash
# Search for telemetry-related documentation, READMEs, or code comments
find . -type f \( -name "README*" -o -name "*.md" -o -name "telemetry*" \) | head -20
rg -i "telemetry.*provider|telemetry.*backend|telemetry.*config" --type yaml --type python -B1 -A1 | head -30
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 617

🏁 Script executed:

```bash
#!/bin/bash
# Search for telemetry documentation
rg -i "telemetry" docs/ -l 2>/dev/null | head -10
# Also check config documentation
grep -i "telemetry" README.md docs/config.md 2>/dev/null
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 319

🏁 Script executed:

```bash
#!/bin/bash
# Check providers documentation for telemetry config
grep -i "telemetry" docs/providers.md -B2 -A5 | head -40
# Also check deployment guide
grep -i "telemetry" docs/deployment_guide.md -B2 -A5 | head -40
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 2149

🏁 Script executed:

```bash
#!/bin/bash
# Get more context from deployment_guide.md about telemetry configuration
sed -n '1,2000p' docs/deployment_guide.md | grep -i "telemetry" -B5 -A10 | head -60
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 2874

🏁 Script executed:

```bash
#!/bin/bash
# Check other run.yaml examples to see how telemetry should be configured
echo "=== Checking examples/run.yaml ==="
grep -A5 "telemetry:" examples/run.yaml | head -20
echo ""
echo "=== Checking tests/configuration/run.yaml ==="
grep -A5 "telemetry:" tests/configuration/run.yaml | head -20
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 391

🏁 Script executed:

```bash
#!/bin/bash
# Check if there are any code comments or documentation about when telemetry provider config is required
rg -i "telemetry.*provider|telemetry.*config|enabled.*true" docs/ -B2 -A2 | head -40
# Also check the main config documentation
head -100 docs/config.md
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 4089
Add telemetry provider configuration to match test and documentation patterns.
Telemetry is enabled but lacks an explicit provider configuration under the providers section, unlike other APIs (inference, safety, scoring, etc.). The test configuration at tests/configuration/run.yaml shows the proper pattern with providers.telemetry using the meta-reference provider. Either add explicit telemetry provider config or document why using a default backend is acceptable.
🤖 Prompt for AI Agents
In run.yaml around lines 114 to 115, telemetry is enabled but missing an
explicit providers.telemetry entry; add a providers.telemetry block mirroring
the pattern used by inference/safety/scoring (e.g., include a telemetry provider
key such as meta-reference with its required connection/config fields used in
tests) so telemetry has an explicit provider configuration, or update docs to
justify relying on a default backend—ensure the YAML structure follows the
existing providers.<provider> pattern used elsewhere.
provider config for telemetry is not supported in 0.3.x
@are-ces, thank you for the clarification! I've noted that telemetry provider configuration is not supported in version 0.3.x, which explains why the configuration is valid as-is with just the `enabled: true` flag.
✏️ Learnings added
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
```yaml
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5
```
Vector stores configuration uses v1.5, though v2 is now available as a newer alternative.
The FAISS and sentence-transformers configuration is properly set. However, Nomic Embed Text V2 is now available for production workloads as a successor to v1.5, offering multilingual support if needed. Additionally, ensure that when embeddings are generated at runtime, the text prompt includes a task instruction prefix (e.g., `search_document:` or `search_query:` for RAG applications) to align with the model's requirements.
🤖 Prompt for AI Agents
In tests/e2e/configs/run-azure.yaml around lines 146 to 150, the vector store
config pins the embedding model to nomic-ai/nomic-embed-text-v1.5; update
model_id to the v2 release (e.g., nomic-ai/nomic-embed-text-v2) to use the newer
multilingual production model and adjust any downstream compatibility settings
if required, and ensure that runtime embedding calls prepend the appropriate
task instruction prefix (for example "search_document:" or "search_query:") to
the text passed to the embedding API so the model receives the required task
context for RAG/search use cases.
```diff
   datasetio:
-  - provider_id: huggingface
-    provider_type: remote::huggingface
-    config:
+  - config:
       kvstore:
-        namespace: datasetio::huggingface
+        namespace: huggingface_datasetio
         backend: kv_default
-  - provider_id: localfs
-    provider_type: inline::localfs
+    provider_id: huggingface
+    provider_type: remote::huggingface
     config:
       kvstore:
-        namespace: datasetio::localfs
+        namespace: localfs_datasetio
         backend: kv_default
-  scoring:
-  - provider_id: basic
-    provider_type: inline::basic
-    config: {}
-  tool_runtime:
-  - provider_id: rag-runtime
-    provider_type: inline::rag-runtime
-    config: {}
-  - provider_id: model-context-protocol
-    provider_type: remote::model-context-protocol
-    config: {}
-  batches:
-  - provider_id: reference
-    provider_type: inline::reference
+    provider_id: localfs
+    provider_type: inline::localfs
     config:
       kvstore:
-        namespace: batches
+        namespace: eval_store
         backend: kv_default
+    provider_id: meta-reference
+    provider_type: inline::meta-reference
```
Critical: Missing list item markers cause duplicate key errors.
The `datasetio` and `eval` provider entries are missing the `-` list item markers, causing YAML duplicate key errors. This will break configuration parsing at runtime.
🔎 Apply this diff to fix the YAML structure:

```diff
   datasetio:
-  - config:
+  - config:
       kvstore:
         namespace: huggingface_datasetio
         backend: kv_default
       provider_id: huggingface
       provider_type: remote::huggingface
+  - config:
+      kvstore:
+        namespace: localfs_datasetio
+        backend: kv_default
+    provider_id: localfs
+    provider_type: inline::localfs
+  eval:
+  - config:
+      kvstore:
+        namespace: eval_store
+        backend: kv_default
+    provider_id: meta-reference
+    provider_type: inline::meta-reference
-    config:
-      kvstore:
-        namespace: localfs_datasetio
-        backend: kv_default
-    provider_id: localfs
-    provider_type: inline::localfs
-    config:
-      kvstore:
-        namespace: eval_store
-        backend: kv_default
-    provider_id: meta-reference
-    provider_type: inline::meta-reference
```

Compare with the correctly structured examples/run.yaml (lines 99-118), which uses proper list syntax with `-` markers for each provider entry.
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 94-94: duplication of key "config" in mapping
(key-duplicates)
[error] 98-98: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 99-99: duplication of key "provider_type" in mapping
(key-duplicates)
[error] 100-100: duplication of key "config" in mapping
(key-duplicates)
[error] 104-104: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 105-105: duplication of key "provider_type" in mapping
(key-duplicates)
🤖 Prompt for AI Agents
In tests/e2e/configs/run-rhaiis.yaml around lines 87 to 105, the datasetio (and
similarly eval) provider entries are written as repeated keys instead of YAML
list items, causing duplicate key parse errors; fix by converting each provider
block into a proper list entry: add a leading '-' before each provider object
(e.g., before each config/provider_id/provider_type group), ensure consistent
indentation for the list items and their child keys to match examples/run.yaml
(lines ~99-118), and validate the file parses as YAML after the change.
```diff
   datasetio:
-  - provider_id: huggingface
-    provider_type: remote::huggingface
-    config:
+  - config:
       kvstore:
-        namespace: datasetio::huggingface
+        namespace: huggingface_datasetio
         backend: kv_default
-  - provider_id: localfs
-    provider_type: inline::localfs
+    provider_id: huggingface
+    provider_type: remote::huggingface
     config:
       kvstore:
-        namespace: datasetio::localfs
+        namespace: localfs_datasetio
         backend: kv_default
-  scoring:
-  - provider_id: basic
-    provider_type: inline::basic
-    config: {}
-  tool_runtime:
-  - provider_id: rag-runtime
-    provider_type: inline::rag-runtime
-    config: {}
-  - provider_id: model-context-protocol
-    provider_type: remote::model-context-protocol
-    config: {}
-  batches:
-  - provider_id: reference
-    provider_type: inline::reference
+    provider_id: localfs
+    provider_type: inline::localfs
     config:
       kvstore:
-        namespace: batches
+        namespace: eval_store
         backend: kv_default
+    provider_id: meta-reference
+    provider_type: inline::meta-reference
```
Critical: YAML structure is malformed with duplicate keys.
The `datasetio` and `eval` sections have broken YAML structure due to missing list item markers (`-`). Static analysis correctly identified duplicate keys (`config`, `provider_id`, `provider_type`), which indicates the second datasetio provider and the eval provider are incorrectly merged.
🔎 Apply this diff to fix the YAML structure:

```diff
   datasetio:
-  - config:
-      kvstore:
-        namespace: huggingface_datasetio
-        backend: kv_default
-    provider_id: huggingface
-    provider_type: remote::huggingface
+  - config:
+      kvstore:
+        namespace: huggingface_datasetio
+        backend: kv_default
+    provider_id: huggingface
+    provider_type: remote::huggingface
+  - config:
+      kvstore:
+        namespace: localfs_datasetio
+        backend: kv_default
+    provider_id: localfs
+    provider_type: inline::localfs
+  eval:
+  - config:
+      kvstore:
+        namespace: eval_store
+        backend: kv_default
+    provider_id: meta-reference
+    provider_type: inline::meta-reference
-    config:
-      kvstore:
-        namespace: localfs_datasetio
-        backend: kv_default
-    provider_id: localfs
-    provider_type: inline::localfs
-    config:
-      kvstore:
-        namespace: eval_store
-        backend: kv_default
-    provider_id: meta-reference
-    provider_type: inline::meta-reference
```

Compare with the correct structure in examples/vllm-rhelai-run.yaml (lines 87-106) for reference.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
datasetio:
- config:
    kvstore:
      namespace: huggingface_datasetio
      backend: kv_default
  provider_id: huggingface
  provider_type: remote::huggingface
- config:
    kvstore:
      namespace: localfs_datasetio
      backend: kv_default
  provider_id: localfs
  provider_type: inline::localfs
eval:
- config:
    kvstore:
      namespace: eval_store
      backend: kv_default
  provider_id: meta-reference
  provider_type: inline::meta-reference
```
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 94-94: duplication of key "config" in mapping
(key-duplicates)
[error] 98-98: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 99-99: duplication of key "provider_type" in mapping
(key-duplicates)
[error] 100-100: duplication of key "config" in mapping
(key-duplicates)
[error] 104-104: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 105-105: duplication of key "provider_type" in mapping
(key-duplicates)
🤖 Prompt for AI Agents
In tests/e2e/configs/run-rhelai.yaml around lines 87 to 105, the YAML list items
under datasetio (and the eval section) are missing leading '-' markers, causing
duplicate mapping keys (config, provider_id, provider_type) to be merged
incorrectly; fix by converting each provider block into a proper list entry
(prepend '-' to each provider object under datasetio and ensure eval is a
separate mapping, not merged), align indentation so each entry contains its own
config/kvstore block, provider_id, and provider_type, and verify the final
structure matches examples/vllm-rhelai-run.yaml lines 87-106.
```gherkin
And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
When I use "query" to ask question with authorization header
```
🧩 Analysis chain
🏁 Script executed:

```bash
cat -n tests/e2e/features/faiss.feature | sed -n '30,35p'
```

Repository: lightspeed-core/lightspeed-stack

Length of output: 542
Complete the JWT token with all three segments.
A well-formed JWT consists of three concatenated Base64url-encoded strings, separated by dots. The token on line 31 is missing the signature segment and the payload appears truncated. Provide a complete, valid JWT token with all three parts (header.payload.signature).
🤖 Prompt for AI Agents
In tests/e2e/features/faiss.feature around lines 31 to 32, the Authorization
Bearer token is incomplete (missing the third signature segment and has a
truncated payload); replace that string with a well-formed JWT consisting of
three Base64url-encoded segments separated by dots (header.payload.signature).
Ensure the token used in the test is complete and valid for your test
environment (either a real signed token or a locally generated test token with
matching server validation), and update the line to include the full token
string.
Force-pushed from 160b644 to ca83927
Description

- FAISS e2e test exercising rag (making sure the retrieved chunks are from the docs in the db).
- New dependency added (pyproject.toml).
- Updated docs/providers; revoked support for shields as they are unusable now.
- Cleaned up the examples folder; removed old unsupported configs (lls 0.2.x format).

Test:
Ask the agent "What is the title of the article from Paul?", expected answer: "great work"
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
NA
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores