Description
Is your feature request related to a problem? Please describe.
Currently, starting Lightspeed Stack requires users to manually manage two separate processes:
- Start the Llama Stack server:
  export OPENAI_API_KEY=<key> && uv run llama stack run run.yaml
- Start Lightspeed Stack:
  make run
This creates several issues:
- Excessive documentation focus on middleware: Getting started guides, architecture diagrams, and tutorials spend significant time explaining Llama Stack configuration and startup, even though it's middleware that should be abstracted away
- High cognitive load: Users must understand and manage the middleware layer explicitly
- Tight coupling: The current architecture tightly couples the user experience to a specific middleware implementation (Llama Stack)
- Limited future flexibility: Switching middleware in the future would require significant user-facing changes
Describe the solution you'd like
Single-command startup that abstracts middleware management:
OPENAI_API_KEY=<key> make run
This command should:
- Automatically start Llama Stack in the background (if needed)
- Start Lightspeed Stack main service
- Handle graceful shutdown of both processes
- Provide unified logging/status output
- Validate configuration before starting
Expected Benefits
- Simplified documentation: Focus on Lightspeed Stack features, not middleware setup
- Better abstraction: Middleware becomes an implementation detail
- Future-proof architecture: Switching middleware doesn't change user-facing startup process
Describe alternatives you've considered
Option 1: Python Subprocess Management
Create a launcher script (scripts/start.py) that:
- Validates configuration
- Starts Llama Stack as subprocess
- Monitors health endpoints
- Starts Lightspeed Stack
- Handles SIGTERM/SIGINT for graceful shutdown
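The steps above could be sketched roughly as follows. This is a minimal illustration only: the health-check URL and port (8321), the log/PID-free subprocess handling, and the exact launch commands are assumptions, not the project's actual values.

```python
#!/usr/bin/env python3
"""Hypothetical launcher sketch (scripts/start.py); paths and endpoints are assumed."""
import os
import subprocess
import sys
import time
import urllib.request


def wait_for_health(url: str, timeout: float = 30.0) -> bool:
    """Poll a health endpoint until it responds with 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            time.sleep(0.5)  # not up yet; retry until the deadline
    return False


def main() -> int:
    # Validate configuration before starting anything.
    if "OPENAI_API_KEY" not in os.environ:
        print("error: OPENAI_API_KEY is not set", file=sys.stderr)
        return 1

    # Start Llama Stack as a background subprocess, logging to a file.
    with open(".llama_stack.log", "wb") as log:
        llama = subprocess.Popen(
            ["uv", "run", "llama", "stack", "run", "run.yaml"],
            stdout=log, stderr=subprocess.STDOUT,
        )

    # Assumed health endpoint; adjust to the real Llama Stack URL/port.
    if not wait_for_health("http://localhost:8321/v1/health"):
        print("error: Llama Stack did not become healthy", file=sys.stderr)
        llama.terminate()
        return 1

    # Run Lightspeed Stack in the foreground; KeyboardInterrupt (Ctrl+C)
    # falls through to the finally block, so both processes shut down.
    try:
        return subprocess.call(["uv", "run", "src/lightspeed_stack.py"])
    finally:
        llama.terminate()
        try:
            llama.wait(timeout=10)
        except subprocess.TimeoutExpired:
            llama.kill()


# In the real script: `if __name__ == "__main__": sys.exit(main())`
```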
Option 2: Process Management in Makefile
run:
	@echo "Starting Lightspeed Stack..."
	@# Start Llama Stack in background
	@uv run llama stack run run.yaml > .llama_stack.log 2>&1 & echo $$! > .llama_stack.pid
	@sleep 2  # Wait for Llama Stack to be ready
	@# Start Lightspeed Stack
	@uv run src/lightspeed_stack.py
	@# Cleanup on exit
	@kill $$(cat .llama_stack.pid) 2>/dev/null || true
	@rm -f .llama_stack.pid
Option 3: Use Existing Container Orchestration
Leverage the existing docker-compose.yaml for local development:
make run # calls: podman compose up
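For illustration, a compose file for Option 3 might look like the sketch below. The service names, images, and ports are hypothetical placeholders, not the repository's actual docker-compose.yaml.

```yaml
# Hypothetical compose sketch; images and ports are assumptions.
services:
  llama-stack:
    image: <llama-stack-image>
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    ports:
      - "8321:8321"
  lightspeed-stack:
    image: <lightspeed-stack-image>
    depends_on:
      - llama-stack
    ports:
      - "8080:8080"
```

With this in place, make run reduces to a single podman compose up, and shutdown of both services is handled by podman compose down or Ctrl+C.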
Acceptance Criteria
- Users can start both services with a single command: OPENAI_API_KEY=<key> make run
- Llama Stack automatically starts in the background (when using service mode)
- Both services shut down gracefully with Ctrl+C
- Error messages are clear if configuration is invalid
- Documentation updated to reflect simplified startup
- Library mode (embedded Llama Stack) continues to work as-is
- Existing make run behavior migrates smoothly, with no breaking changes for current users
Additional context
Architecture Philosophy
Middleware should be invisible: Users should focus on Lightspeed Stack capabilities (queries, RAG, agents, safety), not on how the middleware layer is implemented. The abstraction should be clean enough that switching from Llama Stack to another middleware in the future requires minimal user-facing changes.
This aligns with the following software architecture principles:
- Separation of concerns: Application layer vs middleware layer
- Single Responsibility: Users manage one service (Lightspeed Stack), not two
- Open/Closed Principle: Open for middleware extension, closed for modification of user experience