@codeflash-ai codeflash-ai bot commented Nov 13, 2025

📄 21% (0.21x) speedup for _can_object_call_model in litellm/proxy/auth/auth_checks.py

⏱️ Runtime : 159 milliseconds → 132 milliseconds (best of 12 runs)

📝 Explanation and details

The optimized code achieves a 20% speedup through several key micro-optimizations in the _check_model_access_helper function:

Primary Optimization - Conditional List Filtering:
The original code always created filtered_models by filtering out access groups, even when access_groups was empty. The optimized version only performs this expensive list comprehension when access groups actually exist:

```python
# Original: always filters
filtered_models = [m for m in models if m not in access_groups]

# Optimized: only filters when necessary
access_group_keys = set(access_groups.keys()) if access_groups else set()
filtered_models = [m for m in models if m not in access_group_keys] if access_group_keys else models
```

Performance Impact:

  • When no access groups exist (common case), avoids O(n*m) list comprehension entirely
  • When access groups do exist, membership tests go against a prebuilt set of keys, which is marginally cheaper than repeated dict key lookups (both are O(1), but the set lookup carries less overhead)
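The membership-cost claim above can be checked with a quick `timeit` microbenchmark (illustrative only; absolute numbers vary by machine, but the set lookup should win by orders of magnitude on a 1000-element collection):

```python
import timeit

models = [f"model-{i}" for i in range(1000)]
model_set = set(models)

# O(n) scan: the worst case touches every element before finding the target
list_time = timeit.timeit(lambda: "model-999" in models, number=10_000)
# O(1) average-case hash lookup
set_time = timeit.timeit(lambda: "model-999" in model_set, number=10_000)

print(f"list membership: {list_time:.4f}s, set membership: {set_time:.4f}s")
```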

Secondary Optimization - Set-based Lookups:
Replaced repeated list membership checks with set-based operations:

```python
# Original: multiple O(n) list lookups
if "*" in filtered_models:
if model not in filtered_models:

# Optimized: single set creation, then O(1) lookups
filtered_model_set = set(filtered_models)
if "*" in filtered_model_set:
if model not in filtered_model_set:
```

Why This Matters:
This function is called in authentication hot paths from can_key_call_model, can_team_access_model, can_user_call_model, and can_org_access_model - all critical for API request validation. The test results show consistent 5-20% improvements across various scenarios, with the largest gains (20.9%) in the test_large_scale_list_of_models_all_allowed case where the conditional filtering optimization has maximum impact.

The optimizations are particularly effective for deployments with many models but few/no access groups, which represents typical production configurations.
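Putting the two optimizations together, the filtering logic can be sketched as a standalone function (a hypothetical simplified reconstruction for illustration, not the actual litellm `_check_model_access_helper`):

```python
from typing import Dict, List, Optional

def can_call_model(model: str, models: List[str],
                   access_groups: Optional[Dict[str, list]] = None) -> bool:
    # Optimization 1: skip the filtering pass entirely when there are
    # no access groups (the common production case)
    access_group_keys = set(access_groups.keys()) if access_groups else set()
    filtered_models = (
        [m for m in models if m not in access_group_keys]
        if access_group_keys
        else models
    )

    # Optimization 2: build the set once, then every membership check is O(1)
    filtered_model_set = set(filtered_models)
    if "*" in filtered_model_set:
        return True
    return model in filtered_model_set

print(can_call_model("gpt-4", ["gpt-4", "gpt-3.5-turbo"]))      # True
print(can_call_model("claude-3", ["*"]))                        # True (wildcard)
print(can_call_model("gpt-4", ["gpt-4"], {"gpt-4": ["m1"]}))    # False: name is filtered out as an access group
```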

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 50 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 70.6% |
🌀 Generated Regression Tests and Runtime
```python
from typing import Dict, List, Literal, Optional, Union

# imports
import pytest
from litellm.proxy.auth.auth_checks import _can_object_call_model

# --- Minimal stubs for dependencies ---

class ProxyException(Exception):
    def __init__(self, message, type, param, code):
        super().__init__(message)
        self.type = type
        self.param = param
        self.code = code

class SpecialModelNames:
    all_proxy_models = type("Enum", (), {"value": "__all_proxy_models__"})

DEFAULT_MAX_RECURSE_DEPTH = 10

# --- litellm stub ---

class litellm:
    model_alias_map = {}

# --- Unit Tests ---

# Basic Test Cases

def test_basic_direct_model_access():
    # User allowed to access direct model
    codeflash_output = _can_object_call_model(
        model="gpt-3.5-turbo",
        llm_router=None,
        models=["gpt-3.5-turbo"],
    ) # 8.56μs -> 8.05μs (6.32% faster)

def test_basic_team_model_alias_access():
    # User allowed to access via team_model_aliases
    codeflash_output = _can_object_call_model(
        model="gpt-4o",
        llm_router=None,
        models=["other-model"],
        team_model_aliases={"gpt-4o": "gpt-4o-team-1"},
    ) # 7.25μs -> 6.43μs (12.7% faster)

def test_basic_wildcard_access():
    # User allowed to access via wildcard pattern
    codeflash_output = _can_object_call_model(
        model="bedrock/us.amazon.nova-micro-v1:0",
        llm_router=None,
        models=["bedrock/*"],
    ) # 12.4μs -> 12.5μs (0.903% slower)

def test_basic_all_access_star():
    # User allowed to access any model via "*"
    codeflash_output = _can_object_call_model(
        model="any-model",
        llm_router=None,
        models=["*"],
    ) # 10.7μs -> 10.2μs (4.39% faster)

def test_basic_list_of_models():
    # User allowed to access all models in list
    codeflash_output = _can_object_call_model(
        model=["gpt-3.5-turbo", "gpt-4"],
        llm_router=None,
        models=["gpt-3.5-turbo", "gpt-4"],
    ) # 17.1μs -> 16.3μs (4.44% faster)

def test_edge_empty_models_and_filtered_models():
    # models empty, filtered_models empty, should allow all
    codeflash_output = _can_object_call_model(
        model="gpt-4",
        llm_router=None,
        models=[],
    ) # 8.23μs -> 7.56μs (8.82% faster)

def test_edge_model_matches_custom_wildcard():
    # allowed_model_list contains custom wildcard pattern
    codeflash_output = _can_object_call_model(
        model="bedrock/us.amazon.nova-micro-v1:0",
        llm_router=None,
        models=["bedrock/us.*"],
    ) # 13.5μs -> 14.2μs (5.06% slower)

def test_edge_models_is_none():
    # models is None, should raise
    with pytest.raises(TypeError):
        _can_object_call_model(
            model="gpt-3.5-turbo",
            llm_router=None,
            models=None,
        ) # 5.54μs -> 6.07μs (8.66% slower)

# Large Scale Test Cases

def test_large_scale_many_models_allowed():
    # User allowed to access from large allowed model list
    allowed_models = [f"model-{i}" for i in range(1000)]
    codeflash_output = _can_object_call_model(
        model="model-999",
        llm_router=None,
        models=allowed_models,
    ) # 177μs -> 167μs (5.97% faster)

def test_large_scale_list_of_models_all_allowed():
    # All models in list are allowed
    allowed_models = [f"model-{i}" for i in range(1000)]
    models_to_test = [f"model-{i}" for i in range(1000)]
    codeflash_output = _can_object_call_model(
        model=models_to_test,
        llm_router=None,
        models=allowed_models,
    ) # 156ms -> 129ms (20.9% faster)

def test_large_scale_wildcard_access():
    # Wildcard allows all models
    allowed_models = ["model-*"]
    for i in range(0, 1000, 100):  # test every 100th model
        codeflash_output = _can_object_call_model(
            model=f"model-{i}",
            llm_router=None,
            models=allowed_models,
        ) # 41.7μs -> 40.7μs (2.50% faster)

def test_large_scale_team_model_aliases():
    # Large team_model_aliases mapping
    team_model_aliases = {f"model-{i}": f"team-model-{i}" for i in range(1000)}
    for i in range(0, 1000, 100):
        codeflash_output = _can_object_call_model(
            model=f"model-{i}",
            llm_router=None,
            models=["not-allowed-model"],
            team_model_aliases=team_model_aliases,
        ) # 21.1μs -> 19.2μs (10.2% faster)

def test_large_scale_list_of_models_with_wildcard():
    # List of models, all match wildcard, should allow
    allowed_models = ["model-*"]
    models_to_test = [f"model-{i}" for i in range(1000)]
    codeflash_output = _can_object_call_model(
        model=models_to_test,
        llm_router=None,
        models=allowed_models,
    ) # 2.27ms -> 2.10ms (8.15% faster)
```

```python
from typing import Dict, List, Literal, Optional, Union

# imports
import pytest
from litellm.proxy.auth.auth_checks import _can_object_call_model


class ProxyException(Exception):
    def __init__(self, message, type, param, code):
        super().__init__(message)
        self.message = message
        self.type = type
        self.param = param
        self.code = code

class SpecialModelNames:
    all_proxy_models = type('Enum', (), {'value': '__all_proxy_models__'})

DEFAULT_MAX_RECURSE_DEPTH = 5

# Minimal litellm global with model_alias_map
class _Litellm:
    model_alias_map = {}

litellm = _Litellm()

# ------------------- UNIT TESTS -------------------

# 1. Basic Test Cases

def test_basic_direct_access():
    # Allowed model is directly in the list
    codeflash_output = _can_object_call_model("gpt-3", None, ["gpt-3"]) # 10.9μs -> 10.4μs (4.47% faster)

def test_basic_wildcard_access():
    # Wildcard '*' allows all models
    codeflash_output = _can_object_call_model("gpt-4", None, ["*"]) # 13.5μs -> 13.5μs (0.126% faster)
    codeflash_output = _can_object_call_model("foo-bar", None, ["*"]) # 4.92μs -> 4.55μs (8.07% faster)

def test_basic_team_model_alias():
    # Model is in team_model_aliases
    codeflash_output = _can_object_call_model("gpt-4o", None, ["foo"], team_model_aliases={"gpt-4o": "gpt-4o-team-1"}) # 5.41μs -> 4.64μs (16.6% faster)

def test_edge_empty_models_list_with_star():
    # If models list is empty but '*' is present, allow all
    codeflash_output = _can_object_call_model("gpt-3", None, ["*"]) # 13.5μs -> 13.2μs (1.77% faster)

def test_large_many_team_model_aliases():
    # 500 team aliases, test access for one
    team_aliases = {f"alias-{i}": f"real-{i}" for i in range(500)}
    codeflash_output = _can_object_call_model("alias-123", None, ["foo"], team_model_aliases=team_aliases) # 7.58μs -> 6.67μs (13.7% faster)

def test_large_multiple_object_types():
    # Try all object_types with large allowed list
    allowed = [f"model-{i}" for i in range(100)]
    for ot in ["user", "team", "key", "org"]:
        codeflash_output = _can_object_call_model("model-99", None, allowed, object_type=ot) # 88.6μs -> 81.2μs (9.12% faster)

def test_large_models_with_star_and_special():
    # Large allowed list including '*' and special model name
    allowed = [f"model-{i}" for i in range(100)] + ["*", SpecialModelNames.all_proxy_models.value]
    codeflash_output = _can_object_call_model("anything", None, allowed) # 21.8μs -> 17.2μs (27.1% faster)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
```

To edit these changes, run `git checkout codeflash/optimize-_can_object_call_model-mhx00tg1` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 13, 2025 05:39
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 13, 2025