@codeflash-ai codeflash-ai bot commented Nov 13, 2025

📄 8% (0.08x) speedup for _model_custom_llm_provider_matches_wildcard_pattern in litellm/proxy/auth/auth_checks.py

⏱️ Runtime : 23.7 milliseconds → 22.1 milliseconds (best of 88 runs)

📝 Explanation and details

The optimized code achieves a 7% speedup by reducing redundant string operations and improving dictionary-based lookups, particularly beneficial for the authentication pattern matching hot path.

Key Optimizations

1. Eliminated Redundant String Splitting

  • Cached model.split("/", 1) result in split_model variable, reusing it throughout the function
  • Extracted model_prefix = split_model[0] to avoid repeated indexing
  • Pre-computed num_parts = len(split_model) for length checks
  • This reduces multiple expensive string split operations from ~6 calls to 1 call per invocation (see the combined sketch after optimization 2)

2. Pre-computed Provider Membership Checks

  • Cached model_prefix in litellm.provider_list as model_prefix_in_provider
  • Cached model_prefix not in litellm.model_list_set as model_prefix_not_in_model_set
  • These boolean results are reused in subsequent conditionals, avoiding repeated set membership tests (see the sketch below)
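
A minimal sketch of the first two optimizations, assuming plain Python sets for litellm.provider_list and litellm.model_list_set (variable names follow the description above; the real get_llm_provider has many more branches):

```python
# Sketch only: illustrates the caching pattern, not the actual litellm code.
def _prefix_info(model: str, provider_list: set, model_list_set: set):
    split_model = model.split("/", 1)   # split once, reuse everywhere
    model_prefix = split_model[0]       # cached instead of repeated indexing
    num_parts = len(split_model)        # pre-computed length for later checks

    # Each set-membership test runs once; later branches reuse the booleans.
    model_prefix_in_provider = model_prefix in provider_list
    model_prefix_not_in_model_set = model_prefix not in model_list_set

    return (
        split_model,
        model_prefix,
        num_parts,
        model_prefix_in_provider,
        model_prefix_not_in_model_set,
    )
```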

3. Optimized API Base Endpoint Matching

  • Replaced sequential if/elif chain with a dictionary mapping (endpoint_map) for most static endpoints
  • Dictionary lookups are O(1) vs O(n) sequential string comparisons
  • Preserved complex logic for endpoints requiring special handling (like codestral with dual providers); see the sketch below
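
A sketch of the dispatch-table idea (the endpoint_map name comes from the description; the example entries are taken from the openai_compatible_endpoints list in the generated tests below, and the provider names mapped to them are illustrative assumptions):

```python
# Sketch only: a dict lookup replaces a long if/elif chain of string comparisons.
endpoint_map = {
    "api.perplexity.ai": "perplexity",
    "api.groq.com/openai/v1": "groq",
    "api.deepseek.com/v1": "deepseek",
}

def provider_for_api_base(api_base: str):
    provider = endpoint_map.get(api_base)  # O(1) lookup for static endpoints
    if provider is not None:
        return provider
    # Endpoints that need extra logic (e.g. codestral, which can map to two
    # providers) keep their original if/elif handling.
    return None
```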

Performance Impact

The optimizations particularly benefit the authentication hot path where get_llm_provider is called frequently through _model_custom_llm_provider_matches_wildcard_pattern. Based on the function reference, this is used in _model_matches_any_wildcard_pattern_in_list to validate models against allowed wildcard-pattern lists, a critical security check that runs on every API request.
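
For context, the calling pattern looks roughly like this (a sketch; the real _model_matches_any_wildcard_pattern_in_list in litellm/proxy/auth/auth_checks.py may differ in details):

```python
from litellm.proxy.auth.auth_checks import (
    _model_custom_llm_provider_matches_wildcard_pattern,
)

def model_is_allowed(model: str, allowed_patterns: list) -> bool:
    # Every API request runs a check like this against the key's allowed models.
    return any(
        _model_custom_llm_provider_matches_wildcard_pattern(model, pattern)
        for pattern in allowed_patterns
    )
```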

Test Results Show Consistent Gains:

  • Basic pattern matching: 3-8% faster
  • Large-scale operations (500+ models): 5-22% faster
  • Complex wildcard patterns: 6% faster

The optimization is especially effective for workloads with frequent model validation, model provider lookups, or batch processing scenarios where the same function is called repeatedly with different models.
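
For a rough local comparison on such workloads, a simple harness might look like the following (a sketch only; Codeflash's own measurement setup differs, and the model list here is arbitrary):

```python
import timeit

from litellm.proxy.auth.auth_checks import (
    _model_custom_llm_provider_matches_wildcard_pattern,
)

models = [f"openai/model-{i}" for i in range(500)]

def run_batch():
    for m in models:
        _model_custom_llm_provider_matches_wildcard_pattern(m, "openai/*")

# Average seconds per batch of 500 wildcard checks
print(timeit.timeit(run_batch, number=20) / 20)
```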

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 4115 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import re

# imports
import pytest
from litellm.proxy.auth.auth_checks import \
    _model_custom_llm_provider_matches_wildcard_pattern


# Minimal stubs/mocks for litellm and its constants/models, since we cannot import the real ones in this test context.
# These are necessary for the function to work and for the tests to be meaningful.
class DummyAnthropicTextConfig:
    @staticmethod
    def _is_anthropic_text_model(model: str) -> bool:
        return model in {"claude-2", "claude-instant-1"}

class DummyLiteLLMProxyChatConfig:
    @staticmethod
    def _should_use_litellm_proxy_by_default(litellm_params=None):
        return False

    @staticmethod
    def litellm_proxy_get_custom_llm_provider_info(model, api_base=None, api_key=None):
        return model, "litellm_proxy", api_key, api_base

class DummyNscaleConfig:
    API_BASE_URL = "https://inference.api.nscale.com/v1"
    @staticmethod
    def get_api_key(api_key=None):
        return api_key or "dummy_nscale_api_key"

class DummyExceptions:
    class BadRequestError(Exception):
        def __init__(self, message, model, response, llm_provider):
            super().__init__(message)

# Dummy litellm module with required attributes
class DummyLitellm:
    # Provider and model lists
    provider_list = {"openai", "anthropic", "cohere", "mistral", "bedrock"}
    model_list_set = set()
    open_ai_chat_completion_models = {"gpt-4o", "gpt-3.5-turbo"}
    open_ai_text_completion_models = {"text-davinci-003"}
    openai_image_generation_models = {"dall-e-3"}
    open_ai_embedding_models = {"text-embedding-ada-002"}
    anthropic_models = {"claude-2", "claude-instant-1", "claude-3-5-sonnet-20240620"}
    cohere_models = {"command", "command-light"}
    cohere_chat_models = {"command-r", "command-r-plus"}
    cohere_embedding_models = {"embed-english-v2.0"}
    replicate_models = {"meta/llama-2-70b-chat:abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890"}
    openrouter_models = {"openrouter/llama-2-70b"}
    maritalk_models = {"maritalk/mt-model"}
    vertex_chat_models = {"gemini-1.5-pro"}
    vertex_code_chat_models = set()
    vertex_text_models = set()
    vertex_code_text_models = set()
    vertex_language_models = set()
    vertex_embedding_models = set()
    vertex_vision_models = set()
    vertex_ai_image_models = set()
    vertex_ai_video_models = set()
    ai21_chat_models = {"j2-ultra"}
    ai21_models = set()
    aleph_alpha_models = {"luminous-base"}
    baseten_models = {"baseten/llama-2"}
    nlp_cloud_models = {"nlpcloud/gpu-model"}
    petals_models = {"petals/model"}
    bedrock_models = {"anthropic.claude-3-5-sonnet-20240620"}
    bedrock_embedding_models = set()
    bedrock_converse_models = set()
    watsonx_models = {"watsonx/model"}
    empower_models = {"empower/model"}
    gradient_ai_models = {"gradient/model"}
    suppress_debug_info = True
    REPLICATE_MODEL_NAME_WITH_ID_LENGTH = 64
    openai_compatible_endpoints = [
        "api.perplexity.ai", "api.endpoints.anyscale.com/v1", "api.deepinfra.com/v1/openai",
        "api.mistral.ai/v1", "api.groq.com/openai/v1", "https://integrate.api.nvidia.com/v1",
        "https://api.cerebras.ai/v1", "https://inference.baseten.co/v1",
        "https://api.sambanova.ai/v1", "https://api.ai21.com/studio/v1",
        "https://codestral.mistral.ai/v1", "app.empower.dev/api/v1",
        "api.deepseek.com/v1", "https://api.friendli.ai/serverless/v1",
        "api.galadriel.com/v1", "https://api.llama.com/compat/v1",
        "https://api.featherless.ai/v1", DummyNscaleConfig.API_BASE_URL,
        "dashscope-intl.aliyuncs.com/compatible-mode/v1", "api.moonshot.ai/v1",
        "https://api.v0.dev/v1", "https://api.lambda.ai/v1", "https://api.hyperbolic.xyz/v1",
        "https://ai-gateway.vercel.sh/v1", "https://api.inference.wandb.ai/v1"
    ]
    NscaleConfig = DummyNscaleConfig
    AnthropicTextConfig = DummyAnthropicTextConfig
    LiteLLMProxyChatConfig = DummyLiteLLMProxyChatConfig
    exceptions = DummyExceptions

litellm = DummyLitellm
from litellm.proxy.auth.auth_checks import \
    _model_custom_llm_provider_matches_wildcard_pattern

# ----------- UNIT TESTS ------------

# 1. Basic Test Cases
@pytest.mark.parametrize("model, pattern, expected", [
    # OpenAI model with openai/* pattern
    ("gpt-4o", "openai/*", True),
    # Anthropic model with anthropic/* pattern
    ("claude-3-5-sonnet-20240620", "anthropic/*", True),
    # Cohere chat model with cohere_chat/* pattern
    ("command-r", "cohere_chat/*", True),
    # Bedrock model with bedrock/* pattern
    ("anthropic.claude-3-5-sonnet-20240620", "bedrock/*", True),
    # Model with exact match pattern
    ("gpt-4o", "openai/gpt-4o", True),
    # Model with mismatched pattern
    ("gpt-4o", "anthropic/*", False),
    # Model with wildcard only pattern
    ("gpt-4o", "*", False),  # Because provider/model = openai/gpt-4o, pattern = *, no wildcard in provider
    # Model with provider/model pattern, but model doesn't match
    ("gpt-4o", "openai/claude-2", False),
])
def test_basic_cases(model, pattern, expected):
    """Basic functionality tests for wildcard matching."""
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern); result = codeflash_output # 264μs -> 256μs (3.17% faster)
    assert result == expected

# 2. Edge Test Cases
@pytest.mark.parametrize("model, pattern, expected", [
    # Model not in any known provider/model list
    ("unknown-model", "openai/*", False),
    # Model with empty string
    ("", "openai/*", False),
    # Pattern with empty string
    ("gpt-4o", "", False),
    # Pattern with only wildcard
    ("gpt-4o", "*", False),
    # Model with provider prefix and wildcard pattern
    ("openai/gpt-4o", "openai/*", True),
    # Model with provider prefix and mismatched pattern
    ("openai/gpt-4o", "anthropic/*", False),
    # Model with provider prefix and exact match
    ("openai/gpt-4o", "openai/gpt-4o", True),
    # Model with provider prefix and pattern with trailing slash
    ("openai/gpt-4o", "openai/", False),
    # Model with provider prefix and pattern with double wildcard
    ("openai/gpt-4o", "openai/*/*", False),
    # Model with provider prefix and pattern with star in provider
    ("openai/gpt-4o", "*/gpt-4o", False),
])
def test_edge_cases(model, pattern, expected):
    """Edge case tests for wildcard matching."""
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern); result = codeflash_output # 415μs -> 406μs (2.37% faster)
    assert result == expected

# 3. Large Scale Test Cases
def test_large_scale_many_models():
    """Test scalability with many models and patterns."""
    # Generate 500 models of form openai/model-{i}
    models = [f"openai/model-{i}" for i in range(500)]
    # All should match openai/* pattern
    for model in models:
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, "openai/*") # 2.40ms -> 2.26ms (6.05% faster)

def test_large_scale_many_patterns():
    """Test scalability with many patterns."""
    # Model is always gpt-4o, patterns vary
    patterns = [f"openai/gpt-4o{i}" for i in range(500)]
    # None of the generated patterns exactly equals "openai/gpt-4o", so no match is expected
    for pattern in patterns:
        expected = (pattern == "openai/gpt-4o")
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern("gpt-4o", pattern); result = codeflash_output # 2.03ms -> 1.66ms (22.3% faster)
        assert result == expected

def test_large_scale_wildcard_patterns():
    """Test scalability with many wildcard patterns."""
    # Model is openai/model-{i}, patterns are openai/model-*
    for i in range(500):
        model = f"openai/model-{i}"
        pattern = "openai/model-*"
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 2.38ms -> 2.24ms (6.00% faster)

def test_large_scale_non_matching():
    """Test scalability with many non-matching models."""
    # Model is anthropic/model-{i}, pattern is openai/*
    for i in range(500):
        model = f"anthropic/model-{i}"
        pattern = "openai/*"
        codeflash_output = not _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 2.45ms -> 2.33ms (5.25% faster)

# 4. Additional Edge Cases
def test_model_with_slash_in_name():
    """Test model name containing slashes."""
    model = "openai/gpt-4o/extra"
    # Should match openai/* pattern if provider extraction works
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, "openai/*") # 15.5μs -> 14.9μs (3.81% faster)

def test_pattern_with_multiple_wildcards():
    """Test pattern with multiple wildcards."""
    model = "gpt-4o"
    pattern = "openai/gpt-*"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 13.0μs -> 12.3μs (6.29% faster)

def test_model_provider_with_number():
    """Test provider with number in name."""
    # Add provider to dummy litellm for this test
    litellm.provider_list.add("provider2")
    model = "provider2/model-x"
    pattern = "provider2/*"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 171μs -> 166μs (3.14% faster)
    litellm.provider_list.remove("provider2")  # Clean up

def test_model_and_pattern_are_identical():
    """Test model and pattern are exactly the same."""
    model = "gpt-4o"
    pattern = "openai/gpt-4o"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 10.7μs -> 9.90μs (8.58% faster)

def test_model_with_no_provider_and_wildcard_pattern():
    """Test model with no provider and wildcard pattern."""
    model = "gpt-4o"
    pattern = "*"
    codeflash_output = not _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 12.9μs -> 12.4μs (3.87% faster)

def test_model_with_provider_and_star_pattern():
    """Test model with provider and star pattern."""
    model = "openai/gpt-4o"
    pattern = "*"
    codeflash_output = not _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 14.5μs -> 14.0μs (3.38% faster)

def test_model_with_provider_and_model_star_pattern():
    """Test model with provider and model star pattern."""
    model = "openai/gpt-4o"
    pattern = "openai/*"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 13.4μs -> 13.1μs (2.49% faster)

def test_model_with_provider_and_wrong_model_star_pattern():
    """Test model with provider and wrong model star pattern."""
    model = "openai/gpt-4o"
    pattern = "anthropic/*"
    codeflash_output = not _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 13.2μs -> 13.0μs (1.55% faster)

def test_model_with_provider_and_exact_model_pattern():
    """Test model with provider and exact model pattern."""
    model = "openai/gpt-4o"
    pattern = "openai/gpt-4o"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 10.6μs -> 10.0μs (5.86% faster)

def test_model_with_provider_and_non_matching_exact_model_pattern():
    """Test model with provider and non-matching exact model pattern."""
    model = "openai/gpt-4o"
    pattern = "openai/gpt-3.5-turbo"
    codeflash_output = not _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 10.3μs -> 9.88μs (4.01% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import re
# Patch the litellm module
import sys

# imports
import pytest  # used for our unit tests
from litellm.proxy.auth.auth_checks import \
    _model_custom_llm_provider_matches_wildcard_pattern

# --- Unit tests ---

# Basic Test Cases
@pytest.mark.parametrize(
    "model, allowed_model_pattern, expected",
    [
        # OpenAI chat model matches openai/*
        ("gpt-4o", "openai/*", True),
        # Anthropic model matches anthropic/*
        ("claude-3-5-sonnet-20240620", "anthropic/*", True),
        # Cohere chat model matches cohere_chat/*
        ("command-r", "cohere_chat/*", True),
        # Cohere model matches cohere/*
        ("command", "cohere/*", True),
        # Replicate model matches replicate/*
        ("meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3", "replicate/*", True),
        # Bedrock model matches bedrock/*
        ("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/*", True),
        # Openrouter model matches openrouter/*
        ("openrouter/llama-2", "openrouter/*", True),
        # Maritalk model matches maritalk/*
        ("maritalk/mt-1", "maritalk/*", True),
        # Vertex AI model matches vertex_ai/*
        ("gemini-1.5-pro", "vertex_ai/*", True),
        # AI21 model matches ai21_chat/*
        ("j2-ultra", "ai21_chat/*", True),
        # Aleph Alpha model matches aleph_alpha/*
        ("aleph-alpha/luminous-base", "aleph_alpha/*", True),
        # Baseten model matches baseten/*
        ("baseten/gpt-neo", "baseten/*", True),
        # NLP Cloud model matches nlp_cloud/*
        ("nlpcloud/gpt-j", "nlp_cloud/*", True),
        # Petals model matches petals/*
        ("petals/gpt2", "petals/*", True),
        # Empower model matches empower/*
        ("empower/empower-model", "empower/*", True),
        # Gradient AI model matches gradient_ai/*
        ("gradient_ai/gradient-model", "gradient_ai/*", True),
        # OpenAI embedding model matches openai/*
        ("text-embedding-ada-002", "openai/*", True),
        # Bytez model matches bytez/*
        ("bytez/model-xyz", "bytez/*", True),
        # Lemonade model matches lemonade/*
        ("lemonade/model-abc", "lemonade/*", True),
        # Heroku model matches heroku/*
        ("heroku/model-123", "heroku/*", True),
        # CometAPI model matches cometapi/*
        ("cometapi/model-456", "cometapi/*", True),
        # OCI model matches oci/*
        ("oci/model-789", "oci/*", True),
        # Compactifai model matches compactifai/*
        ("compactifai/model-101", "compactifai/*", True),
        # OVHCloud model matches ovhcloud/*
        ("ovhcloud/model-202", "ovhcloud/*", True),
        # Clarifai model matches clarifai/*
        ("clarifai/model-303", "clarifai/*", True),
        # Wildcard model matches openai/*
        ("*", "openai/*", True),
        # OpenAI text completion model matches text-completion-openai/*
        ("text-davinci-003", "text-completion-openai/*", True),
        # Anthropic text model matches anthropic_text/*
        ("claude-2", "anthropic_text/*", True),
        # Anthropic text model does not match anthropic/*
        ("claude-2", "anthropic/*", False),  # Should not match, provider is anthropic_text
        # OpenAI chat model does not match anthropic/*
        ("gpt-4o", "anthropic/*", False),
        # Cohere chat model does not match cohere/*
        ("command-r", "cohere/*", False),
        # Unknown model does not match openai/*
        ("unknown-model", "openai/*", False),
        # OpenAI model with exact (non-wildcard) pattern openai/gpt-4o
        ("gpt-4o", "openai/gpt-4o", False),  # pattern is not a wildcard
        # OpenAI model does not match openai/*extra
        ("gpt-4o", "openai/*extra", False),
    ]
)
def test_basic_model_custom_llm_provider_matches_wildcard_pattern(model, allowed_model_pattern, expected):
    # Basic functional tests for matching
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, allowed_model_pattern); result = codeflash_output # 1.40ms -> 1.38ms (1.95% faster)
    assert result == expected

# Edge Test Cases
@pytest.mark.parametrize(
    "model, allowed_model_pattern, expected",
    [
        # Empty model string, should not match any pattern
        ("", "openai/*", False),
        # Empty pattern string, should not match
        ("gpt-4o", "", False),
        # Pattern with only wildcard
        ("gpt-4o", "*", True),
        # Model with slash but unknown provider
        ("unknownprovider/model-x", "unknownprovider/*", False),
        # Model with slash and known provider, but unknown model
        ("openai/unknown-model", "openai/*", True),  # provider is openai, pattern matches
        # Model with extra slashes
        ("openai/gpt-4o/x", "openai/*", True),
        # Model with special characters
        ("openai/gpt-4o$", "openai/*", True),
        # Model with unicode
        ("openai/模型", "openai/*", True),
        # Pattern with regex-like chars
        ("gpt-4o", "openai/gpt-4.", False),  # pattern is not wildcard
        # Model and pattern both empty
        ("", "", False),
        # Model with only wildcard
        ("*", "*", True),
        # Pattern with multiple wildcards
        ("openai/gpt-4o", "openai/*o*", True),
        # Model matches pattern with trailing wildcard
        ("openai/gpt-4o", "openai/gpt-*", True),
        # Model matches pattern with leading wildcard
        ("openai/gpt-4o", "*gpt-4o", True),
        # Model matches pattern with wildcard in middle
        ("openai/gpt-4o", "openai/*4*", True),
        # Model with provider in pattern but not in model
        ("gpt-4o", "openai/gpt-4o", False),
        # Model with provider in model but not in pattern
        ("openai/gpt-4o", "gpt-4o", False),
        # Model with provider and model, pattern is just provider
        ("openai/gpt-4o", "openai", False),
        # Model with provider and model, pattern is just model
        ("openai/gpt-4o", "gpt-4o", False),
    ]
)
def test_edge_model_custom_llm_provider_matches_wildcard_pattern(model, allowed_model_pattern, expected):
    # Edge case tests for matching
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, allowed_model_pattern); result = codeflash_output # 696μs -> 690μs (0.928% faster)
    assert result == expected

# Large Scale Test Cases
def test_large_scale_model_custom_llm_provider_matches_wildcard_pattern():
    # Generate 500 models with provider/model format, all should match provider/*
    for i in range(500):
        model = f"openai/gpt-model-{i}"
        pattern = "openai/*"
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 2.41ms -> 2.26ms (6.27% faster)

    # Generate 500 models with provider/model format, none should match a different provider/*
    for i in range(500):
        model = f"openai/gpt-model-{i}"
        pattern = "anthropic/*"
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 2.36ms -> 2.22ms (6.25% faster)

    # Generate 500 models with random providers, only correct provider should match
    providers = ["openai", "anthropic", "cohere", "bedrock"]
    for i in range(500):
        provider = providers[i % len(providers)]
        model = f"{provider}/model-{i}"
        pattern = f"{provider}/*"
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 2.52ms -> 2.39ms (5.68% faster)

    # Generate 500 models with known models, all should match * pattern
    known_models = [
        "gpt-4o", "claude-3-5-sonnet-20240620", "command-r", "meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3"
    ]
    for i in range(500):
        model = known_models[i % len(known_models)]
        pattern = "*"
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 2.74ms -> 2.31ms (18.4% faster)

# Mutation-sensitive test: if the function ever returns True for a model not in the provider list, it should fail
def test_mutation_sensitivity():
    # Model with provider not in provider_list, should never match
    model = "notaprovider/model-x"
    pattern = "notaprovider/*"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 174μs -> 172μs (1.62% faster)

    # Model with known provider but unknown model, should match due to provider
    model = "openai/unknown-model"
    pattern = "openai/*"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 12.9μs -> 11.9μs (8.41% faster)

    # Model with known provider and known model, should match
    model = "openai/gpt-4o"
    pattern = "openai/*"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 7.38μs -> 7.03μs (4.91% faster)

    # Model with known provider and known model, pattern is specific model
    model = "openai/gpt-4o"
    pattern = "openai/gpt-4o"
    codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern(model, pattern) # 4.80μs -> 4.46μs (7.58% faster)

# Determinism test: same input always yields same result
def test_determinism():
    for _ in range(10):
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern("gpt-4o", "openai/*") # 90.1μs -> 82.3μs (9.50% faster)
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern("gpt-4o", "anthropic/*")
        codeflash_output = _model_custom_llm_provider_matches_wildcard_pattern("unknown-model", "openai/*") # 62.0μs -> 55.8μs (11.3% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-_model_custom_llm_provider_matches_wildcard_pattern-mhx2m80v and push.
