
[RFE] Simplify startup: Make "library mode" for llama-stack middleware the default mode #778

@anik120

Description


Is your feature request related to a problem? Please describe.

Currently, starting Lightspeed Stack requires users to manually manage two separate processes:

  1. Start Llama Stack server: export OPENAI_API_KEY=<key> && uv run llama stack run run.yaml
  2. Start Lightspeed Stack: make run

This creates several issues:

  • Excessive documentation focus on middleware: Getting-started guides, architecture diagrams, and tutorials spend significant time explaining Llama Stack configuration and startup, even though it is middleware that should be abstracted away
  • High cognitive load: Users must understand and manage the middleware layer explicitly
  • Tight coupling: The current architecture tightly couples the user experience to a specific middleware implementation (Llama Stack)
  • Limited future flexibility: Switching middleware in the future would require significant user-facing changes

Describe the solution you'd like

Single-command startup that abstracts middleware management:

OPENAI_API_KEY=<key> make run

This command should:

  1. Automatically start Llama Stack in the background (if needed)
  2. Start Lightspeed Stack main service
  3. Handle graceful shutdown of both processes
  4. Provide unified logging/status output
  5. Validate configuration before starting

Expected Benefits

  • Simplified documentation: Focus on Lightspeed Stack features, not middleware setup
  • Better abstraction: Middleware becomes an implementation detail
  • Future-proof architecture: Switching middleware doesn't change user-facing startup process

Describe alternatives you've considered

Option 1: Python Subprocess Management

Create a launcher script (scripts/start.py) that:

  • Validates configuration
  • Starts Llama Stack as subprocess
  • Monitors health endpoints
  • Starts Lightspeed Stack
  • Handles SIGTERM/SIGINT for graceful shutdown
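The steps above could be sketched roughly as follows. The command lines, health-endpoint URL, and port are assumptions for illustration, not confirmed project details:

```python
# Hypothetical sketch of scripts/start.py; command names, ports, and the
# health endpoint are assumptions, not confirmed project details.
import os
import signal
import subprocess
import sys
import time
import urllib.request

LLAMA_CMD = ["uv", "run", "llama", "stack", "run", "run.yaml"]  # assumed
LIGHTSPEED_CMD = ["uv", "run", "src/lightspeed_stack.py"]       # assumed
HEALTH_URL = "http://localhost:8321/v1/health"                  # assumed port/path


def validate_config() -> None:
    """Fail fast before launching anything."""
    if not os.environ.get("OPENAI_API_KEY"):
        sys.exit("error: OPENAI_API_KEY is not set")


def wait_for_health(url: str, timeout: float = 30.0) -> bool:
    """Poll a health endpoint until it returns 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except OSError:  # connection refused, timeout, etc.
            time.sleep(0.5)
    return False


def main() -> None:
    validate_config()
    llama = subprocess.Popen(LLAMA_CMD)  # middleware in the background
    try:
        if not wait_for_health(HEALTH_URL):
            raise RuntimeError("Llama Stack never became healthy")
        lightspeed = subprocess.Popen(LIGHTSPEED_CMD)  # main service

        def shutdown(signum, frame):  # SIGTERM/SIGINT -> stop the main service
            lightspeed.terminate()

        signal.signal(signal.SIGTERM, shutdown)
        signal.signal(signal.SIGINT, shutdown)
        lightspeed.wait()
    finally:
        llama.terminate()  # always tear the middleware down on exit
        llama.wait(timeout=10)

# Entry point would be: if __name__ == "__main__": main()
```

The Makefile's `run` target would then just invoke `uv run scripts/start.py` (wiring assumed), keeping the middleware lifecycle entirely out of the user's hands.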

Option 2: Process Management in Makefile

run:
	@echo "Starting Lightspeed Stack..."
	@# Start Llama Stack in the background and record its PID
	@uv run llama stack run run.yaml > .llama_stack.log 2>&1 & echo $$! > .llama_stack.pid
	@sleep 2 # Crude wait for Llama Stack to be ready
	@# Start Lightspeed Stack in the foreground
	@uv run src/lightspeed_stack.py
	@# Clean up the background Llama Stack process on exit
	@kill $$(cat .llama_stack.pid) 2>/dev/null || true

(Note: recipe lines must be tab-indented, and the cleanup line only runs if the foreground process exits normally; a launcher script handles interrupts more robustly.)

Option 3: Use Existing Container Orchestration

Leverage the existing docker-compose.yaml for local development:
make run # calls: podman compose up
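For reference, a minimal compose sketch in that spirit might look like the following. Service names, image, and ports are illustrative assumptions, not taken from the project's actual docker-compose.yaml:

```yaml
# Hypothetical sketch; images, ports, and service names are assumed.
services:
  llama-stack:
    image: llamastack/distribution-starter:latest  # assumed image
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    ports:
      - "8321:8321"
  lightspeed-stack:
    build: .
    depends_on:
      - llama-stack
    ports:
      - "8080:8080"
```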

Acceptance Criteria

  • Users can start both services with a single command: OPENAI_API_KEY=<key> make run
  • Llama Stack automatically starts in the background (when using service mode)
  • Both services shut down gracefully with Ctrl+C
  • Error messages are clear if configuration is invalid
  • Documentation updated to reflect simplified startup
  • Library mode (embedded Llama Stack) continues to work as-is
  • Existing make run behavior smoothly migrated

Additional context

Architecture Philosophy

Middleware should be invisible: Users should focus on Lightspeed Stack capabilities (queries, RAG, agents, safety), not on how the middleware layer is implemented. The abstraction should be clean enough that switching from Llama Stack to another middleware in the future requires minimal user-facing changes.

This aligns with the following software architecture principles:

  • Separation of concerns: Application layer vs middleware layer
  • Single Responsibility: Users manage one service (Lightspeed Stack), not two
  • Open/Closed Principle: Open for middleware extension, closed for modification of user experience

Labels: enhancement (New feature or request)