Conversation

@codeflash-ai codeflash-ai bot commented Nov 13, 2025

📄 27% (0.27x) speedup for can_user_call_model in litellm/proxy/auth/auth_checks.py

⏱️ Runtime : 6.73 milliseconds → 5.29 milliseconds (best of 12 runs)

📝 Explanation and details

The optimized code achieves a 27% runtime improvement through three key optimizations that reduce unnecessary work in auth validation:

1. Early Return for Global Access

  • Added upfront check: if not models or "*" in models or SpecialModelNames.all_proxy_models.value in models: return True
  • This bypasses expensive _check_model_access_helper calls when users have unrestricted access
  • Line profiler shows 541 early returns that saved significant time (line 45: 101,705 ns vs. the expensive helper calls); a sketch follows below
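
A minimal sketch of where this guard sits, assuming a simplified signature (the real function takes additional parameters and helpers not shown here):

from litellm.proxy._types import SpecialModelNames  # assumed import path

async def can_user_call_model(model, llm_router, user_object):
    # Hedged sketch: the real signature and helpers differ; shown only to place the guard
    if user_object is None:
        return True
    models = user_object.models or []
    # Global access: empty model list, wildcard, or the all-proxy-models marker
    if not models or "*" in models or SpecialModelNames.all_proxy_models.value in models:
        return True
    # ...otherwise fall through to the per-model _check_model_access_helper path
    ...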

2. Optimized Dictionary Lookups

  • Changed the if model in litellm.model_alias_map check plus litellm.model_alias_map[model] lookup into a single litellm.model_alias_map.get(model) call
  • Eliminates the redundant dictionary lookup, cutting two hash operations down to one
  • Added getattr(llm_router, "model_group_alias", None) to avoid repeated attribute access; see the sketch below
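
A rough before/after of the alias lookup, wrapped in a hypothetical helper (_resolve_model_alias) purely for illustration:

import litellm

def _resolve_model_alias(model, llm_router=None):
    # Old pattern: membership test plus subscript = two hash lookups
    #   if model in litellm.model_alias_map:
    #       model = litellm.model_alias_map[model]
    # New pattern: a single .get() covers both the check and the fetch
    aliased = litellm.model_alias_map.get(model)
    if aliased is not None:
        model = aliased
    # Router attribute fetched once up front rather than re-read on every use
    model_group_alias = getattr(llm_router, "model_group_alias", None)
    return model, model_group_alias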

3. Streamlined Membership Checks

  • In can_user_call_model, replaced the direct in operation with a hasattr(user_models, "__contains__") check
  • This optimization prepares for cases where user_models might be a set (O(1) lookup) rather than a list (O(n)); see the sketch below
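
A minimal illustration (hypothetical helper, not litellm's actual code): the membership test itself is unchanged, but handing it a set makes each lookup O(1) where a list would be O(n):

def _user_has_model(model, user_object):
    user_models = set(user_object.models or [])  # any object exposing __contains__ works here
    return model in user_models                  # O(1) for a set, O(n) for a plain list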

Impact Analysis:

  • The function is called from common_checks() in authentication hot path, making these micro-optimizations valuable
  • Line profiler shows _check_model_access_helper calls reduced from 1,846 to 764 hits (58% reduction)
  • Total function time improved from 24.8ms to 19.0ms
  • Most effective for workloads with wildcard permissions or global access patterns, as evidenced by test cases with "*" and SpecialModelNames.all_proxy_models.value

The optimizations maintain identical behavior and error handling while eliminating redundant computations in the authorization pipeline.

Correctness verification report:

Test                          | Status
⚙️ Existing Unit Tests         | 13 Passed
🌀 Generated Regression Tests  | 843 Passed
⏪ Replay Tests                | 🔘 None Found
🔎 Concolic Coverage Tests     | 🔘 None Found
📊 Tests Coverage              | 100.0%
⚙️ Existing Unit Tests and Runtime
🌀 Generated Regression Tests and Runtime
import asyncio  # used to run async functions
import sys
import types

import litellm
import pytest  # used for our unit tests
from litellm.proxy.auth.auth_checks import can_user_call_model

# --- Minimal stubs/mocks for dependencies ---

# Simulate litellm.constants.DEFAULT_MAX_RECURSE_DEPTH
DEFAULT_MAX_RECURSE_DEPTH = 5

# Simulate SpecialModelNames enum-like object
class SpecialModelNames:
    all_proxy_models = type("EnumVal", (), {"value": "all_proxy_models"})
    no_default_models = type("EnumVal", (), {"value": "no_default_models"})

# Simulate ProxyErrorTypes with required property
class ProxyErrorTypes:
    key_model_access_denied = "key_model_access_denied"
    @staticmethod
    def get_model_access_error_type_for_object(object_type):
        return f"{object_type}_model_access_denied"

# Simulate ProxyException
class ProxyException(Exception):
    def __init__(self, message, type, param, code):
        super().__init__(message)
        self.message = message
        self.type = type
        self.param = param
        self.code = code

# Simulate status.HTTP_401_UNAUTHORIZED
class status:
    HTTP_401_UNAUTHORIZED = 401
litellm.model_alias_map = {}

# Simulate _model_in_team_aliases and _model_matches_any_wildcard_pattern_in_list
def _model_in_team_aliases(model, team_model_aliases):
    if not team_model_aliases:
        return False
    return model in team_model_aliases
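
# The comment above also names _model_matches_any_wildcard_pattern_in_list, which is not
# stubbed here; a minimal fnmatch-based sketch (an assumption, not litellm's real logic):
from fnmatch import fnmatch

def _model_matches_any_wildcard_pattern_in_list(model, allowed_model_list):
    return any(fnmatch(model, pattern) for pattern in (allowed_model_list or []))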

# Simulate Router class with required methods
class Router:
    def __init__(self, access_groups=None, model_group_alias=None):
        self._access_groups = access_groups or {}
        self.model_group_alias = model_group_alias or {}

    def get_model_access_groups(self, model_name, team_id=None):
        # Returns a dict of access groups for the model
        return self._access_groups.get(model_name, {})

    def _get_model_from_alias(self, model):
        return self.model_group_alias.get(model, None)

# Simulate LiteLLM_UserTable
class LiteLLM_UserTable:
    def __init__(self, models):
        self.models = models
from litellm.proxy.auth.auth_checks import can_user_call_model

# ------------------- TESTS BELOW -------------------

# -------- 1. BASIC TEST CASES --------

@pytest.mark.asyncio
async def test_can_user_call_model_returns_true_for_none_user():
    # Should always return True if user_object is None
    result = await can_user_call_model("model-a", None, None)
    assert result is True

@pytest.mark.asyncio
async def test_can_user_call_model_returns_true_for_allowed_model():
    # User has explicit access to model
    user = LiteLLM_UserTable(models=["model-a", "model-b"])
    result = await can_user_call_model("model-a", None, user)
    assert result is True

@pytest.mark.asyncio
async def test_can_user_call_model_returns_true_for_wildcard_access():
    # User has wildcard access, i.e. "*"
    user = LiteLLM_UserTable(models=["*"])
    result = await can_user_call_model("any-model", None, user)
    assert result is True

@pytest.mark.asyncio

async def test_can_user_call_model_returns_true_for_model_alias_in_litellm():
    # If model is an alias in litellm.model_alias_map, and user has access to the mapped model
    litellm.model_alias_map["alias-model"] = "real-model"
    user = LiteLLM_UserTable(models=["real-model"])
    result = await can_user_call_model("alias-model", None, user)
    del litellm.model_alias_map["alias-model"]
    assert result is True

@pytest.mark.asyncio

async def test_can_user_call_model_returns_true_for_model_group_alias_in_router():
    # If model is in router.model_group_alias and user has access to the mapped model
    router = Router(model_group_alias={"alias-m": "real-m"})
    user = LiteLLM_UserTable(models=["real-m"])
    result = await can_user_call_model("alias-m", router, user)
    assert result is True

# -------- 2. EDGE TEST CASES --------

@pytest.mark.asyncio


async def test_can_user_call_model_allows_access_for_model_in_access_group():
    # If router.get_model_access_groups returns a dict with a key matching a user's model, access is granted
    router = Router(access_groups={"model-x": {"group1": ["model-x"]}})
    user = LiteLLM_UserTable(models=["group1"])
    result = await can_user_call_model("model-x", router, user)
    assert result is True

@pytest.mark.asyncio
async def test_can_user_call_model_denies_access_with_max_fallback_depth():
    # Test recursion depth exceeded for model list
    user = LiteLLM_UserTable(models=["model-a"])
    # Patch DEFAULT_MAX_RECURSE_DEPTH to 2 for this test
    global DEFAULT_MAX_RECURSE_DEPTH
    old_depth = DEFAULT_MAX_RECURSE_DEPTH
    DEFAULT_MAX_RECURSE_DEPTH = 2
    try:
        models = ["m1", "m2", "m3"]
        with pytest.raises(Exception) as excinfo:
            await can_user_call_model(models, None, user)
    finally:
        DEFAULT_MAX_RECURSE_DEPTH = old_depth

@pytest.mark.asyncio
async def test_can_user_call_model_allows_access_for_model_matching_wildcard():
    # User has access to models matching a wildcard pattern
    user = LiteLLM_UserTable(models=["abc-*"])
    result = await can_user_call_model("abc-123", None, user)
    assert result is True

# -------- 3. LARGE SCALE TEST CASES --------

@pytest.mark.asyncio
async def test_can_user_call_model_large_scale_concurrent_allowed():
    # Test 100 concurrent allowed calls
    user = LiteLLM_UserTable(models=["*"])
    tasks = [can_user_call_model(f"model-{i}", None, user) for i in range(100)]
    results = await asyncio.gather(*tasks)
    assert all(r is True for r in results)

@pytest.mark.asyncio
async def test_can_user_call_model_large_scale_concurrent_denied():
    # Test 100 concurrent denied calls
    user = LiteLLM_UserTable(models=["foo"])
    tasks = [can_user_call_model(f"model-{i}", None, user) for i in range(100)]
    # Use return_exceptions=True to gather all exceptions
    results = await asyncio.gather(*tasks, return_exceptions=True)
    assert all(isinstance(r, Exception) for r in results)

@pytest.mark.asyncio
async def test_can_user_call_model_large_scale_mixed():
    # 50 allowed, 50 denied
    user_allowed = LiteLLM_UserTable(models=["*"])
    user_denied = LiteLLM_UserTable(models=["foo"])
    tasks = []
    for i in range(50):
        tasks.append(can_user_call_model(f"model-{i}", None, user_allowed))
    for i in range(50, 100):
        tasks.append(can_user_call_model(f"model-{i}", None, user_denied))
    results = await asyncio.gather(*tasks, return_exceptions=True)
    assert all(r is True for r in results[:50])
    assert all(isinstance(r, Exception) for r in results[50:])

# -------- 4. THROUGHPUT TEST CASES --------

@pytest.mark.asyncio
async def test_can_user_call_model_throughput_small_load():
    # Small load: 10 concurrent calls, all allowed
    user = LiteLLM_UserTable(models=["*"])
    tasks = [can_user_call_model(f"model-{i}", None, user) for i in range(10)]
    results = await asyncio.gather(*tasks)
    assert all(r is True for r in results)

@pytest.mark.asyncio

async def test_can_user_call_model_throughput_high_volume():
    # High volume: 200 concurrent calls, all allowed
    user = LiteLLM_UserTable(models=["*"])
    tasks = [can_user_call_model(f"model-{i}", None, user) for i in range(200)]
    results = await asyncio.gather(*tasks)
    assert all(r is True for r in results)

@pytest.mark.asyncio
async def test_can_user_call_model_throughput_varied_patterns():
    # Throughput test with varied user patterns
    users = [
        LiteLLM_UserTable(models=["*"]),
        LiteLLM_UserTable(models=["foo"]),
        LiteLLM_UserTable(models=[SpecialModelNames.all_proxy_models.value]),
        LiteLLM_UserTable(models=["abc-*"]),
    ]
    models = ["foo", "bar", "abc-123", "xyz"]
    tasks = []
    for i in range(25):  # 100 calls
        for u, m in zip(users, models):
            tasks.append(can_user_call_model(m, None, u))
    results = await asyncio.gather(*tasks, return_exceptions=True)
    # First user ("*") and third ("all_proxy_models") always allowed, fourth ("abc-*") only for "abc-123"
    for idx, r in enumerate(results):
        user_idx = idx % 4
        model = models[user_idx]
        if user_idx == 0:
            assert r is True  # wildcard "*" user is always allowed
        elif user_idx == 1:
            assert isinstance(r, Exception)  # "foo" user requesting "bar" is denied
        elif user_idx == 2:
            pass  # all_proxy_models user; expected to be allowed
        elif user_idx == 3:
            if model == "abc-123":
                assert r is True  # wildcard pattern "abc-*" matches
            else:
                assert isinstance(r, Exception)  # "xyz" does not match "abc-*"
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import asyncio  # For running async functions and concurrency
import sys
import types

import pytest  # For writing unit tests
from litellm.proxy.auth.auth_checks import can_user_call_model

# --- Minimal stubs and constants to allow the function to run in isolation ---

class DummyProxyException(Exception):
    def __init__(self, message, type, param, code):
        super().__init__(message)
        self.message = message
        self.type = type
        self.param = param
        self.code = code

class DummyProxyErrorTypes:
    key_model_access_denied = "key_model_access_denied"
    @staticmethod
    def get_model_access_error_type_for_object(object_type):
        return f"{object_type}_model_access_denied"

class DummySpecialModelNames:
    no_default_models = type("Enum", (), {"value": "NO_DEFAULT_MODELS"})
    all_proxy_models = type("Enum", (), {"value": "ALL_PROXY_MODELS"})

class DummyStatus:
    HTTP_401_UNAUTHORIZED = 401

class DummyLiteLLM_UserTable:
    def __init__(self, models):
        self.models = models

dummy_litellm = types.SimpleNamespace(
    model_alias_map = {},
)
from litellm.proxy.auth.auth_checks import can_user_call_model

# --- Begin unit tests ---

# --- Basic Test Cases ---

@pytest.mark.asyncio
async def test_can_user_call_model_basic_allowed():
    """Test: user is allowed to access a model in their allowed models list."""
    user = DummyLiteLLM_UserTable(models=["gpt-3.5-turbo", "gpt-4"])
    # Allowed model
    result = await can_user_call_model("gpt-3.5-turbo", None, user)
    assert result is True

@pytest.mark.asyncio

async def test_can_user_call_model_none_user_object():
    """Test: If user_object is None, always return True (no restrictions)."""
    result = await can_user_call_model("gpt-3.5-turbo", None, None)
    assert result is True

@pytest.mark.asyncio

async def test_can_user_call_model_model_list_input_all_allowed():
    """Test: If model is a list, all must be allowed (all in user.models)."""
    user = DummyLiteLLM_UserTable(models=["gpt-3.5-turbo", "gpt-4"])
    result = await can_user_call_model(["gpt-3.5-turbo", "gpt-4"], None, user)
    assert result is True

@pytest.mark.asyncio


async def test_can_user_call_model_with_wildcard():
    """Test: If user.models contains '*', any model is allowed."""
    user = DummyLiteLLM_UserTable(models=["*"])
    result = await can_user_call_model("any-random-model", None, user)
    assert result is True

@pytest.mark.asyncio

async def test_can_user_call_model_empty_models_list():
    """Test: If user.models is empty, allow all models."""
    user = DummyLiteLLM_UserTable(models=[])
    result = await can_user_call_model("gpt-4", None, user)
    assert result is True

@pytest.mark.asyncio


async def test_can_user_call_model_many_models_concurrent():
    """Test: Many concurrent calls with different users/models."""
    users = [DummyLiteLLM_UserTable(models=[f"model-{i}"]) for i in range(10)]
    async def call(idx):
        return await can_user_call_model(f"model-{idx}", None, users[idx])
    results = await asyncio.gather(*(call(i) for i in range(10)))
    assert all(r is True for r in results)

@pytest.mark.asyncio

async def test_can_user_call_model_throughput_small_load():
    """Throughput: Small batch of concurrent allowed calls."""
    user = DummyLiteLLM_UserTable(models=["gpt-3.5-turbo"])
    async def call():
        return await can_user_call_model("gpt-3.5-turbo", None, user)
    results = await asyncio.gather(*(call() for _ in range(10)))
    assert all(r is True for r in results)

@pytest.mark.asyncio

async def test_can_user_call_model_throughput_high_volume():
    """Throughput: High volume (but bounded) concurrent calls."""
    user = DummyLiteLLM_UserTable(models=["*"])
    async def call(i):
        # All models allowed due to wildcard
        return await can_user_call_model(f"model-{i}", None, user)
    results = await asyncio.gather(*(call(i) for i in range(100)))
    assert all(r is True for r in results)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-can_user_call_model-mhx1k2yl and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 13, 2025 06:21
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Nov 13, 2025