@codeflash-ai codeflash-ai bot commented Nov 13, 2025

📄 165% (1.65x) speedup for `can_org_access_model` in `litellm/proxy/auth/auth_checks.py`

⏱️ Runtime : 2.23 milliseconds → 843 microseconds (best of 9 runs)

📝 Explanation and details

The optimized code achieves a **165% speedup** through three key optimizations targeting redundant work and exception handling inefficiencies:

**1. Early-exit optimization for model lists**: The original code would recursively validate each model in a list and always return `True` if all passed. The optimized version uses a try-catch pattern to return `True` immediately upon finding the first allowed model, avoiding unnecessary validation of remaining models in the list.
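The early-exit pattern described above can be sketched as follows. This is a minimal illustration, not the actual litellm implementation: `_check_single_model`, `can_access`, and the plain `Exception` are hypothetical stand-ins for the real helpers and `ProxyException`.

```python
# Minimal sketch of the early-exit pattern (hypothetical helper names,
# not the actual litellm internals).

def _check_single_model(model: str, allowed: set) -> bool:
    """Raise if the org cannot access `model`, else return True."""
    if "*" in allowed or model in allowed:
        return True
    raise Exception(f"org cannot access model: {model}")

def can_access(model, allowed: set) -> bool:
    if isinstance(model, list):
        # Early exit: stop as soon as one model in the list is allowed,
        # instead of validating every remaining entry.
        for m in model:
            try:
                return _check_single_model(m, allowed)
            except Exception:
                continue  # this model failed; try the next one
        raise Exception("no model in the list is allowed")
    return _check_single_model(model, allowed)
```

With this shape, a list like `["denied-model", "gpt-4"]` against an org that allows `gpt-4` returns on the second iteration rather than validating the whole list.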

**2. Duplicate model elimination**: Added a `set()` to track already-checked models, preventing redundant calls to `_check_model_access_helper()` when aliases resolve to the same underlying model. This is particularly effective since the line profiler shows `_check_model_access_helper()` consuming 98%+ of execution time.
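The deduplication idea can be illustrated with a sketch; `expensive_check` is a stand-in for the real `_check_model_access_helper()`, and the function shape is assumed, not taken from the litellm source.

```python
# Sketch of the dedup optimization: each candidate model is validated at
# most once, even when several aliases resolve to the same underlying model.

def first_allowed(potential_models, expensive_check):
    checked = set()
    for model in potential_models:
        if model in checked:
            continue  # already validated via another alias; skip the call
        checked.add(model)
        if expensive_check(model):
            return True
    return False

# Usage: count how often the expensive helper actually runs.
calls = []
def check(model):
    calls.append(model)
    return False

first_allowed(["gpt-4", "gpt-4-alias-target", "gpt-4"], check)
# check runs twice, not three times: the duplicate "gpt-4" is skipped.
```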

**3. Smarter alias resolution**: Uses `litellm.model_alias_map.get(model)` instead of `model in litellm.model_alias_map` followed by dictionary access, reducing lookups. Also adds checks to ensure aliases differ from the original model before adding them to the potential models list.
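A sketch of the lookup change, using a local dict in place of `litellm.model_alias_map` (the `potential_models` helper is illustrative, not the actual function):

```python
# Before: two dict operations per alias lookup.
#     if model in litellm.model_alias_map:
#         alias = litellm.model_alias_map[model]
# After: one .get() call, plus a self-reference guard.

model_alias_map = {"gpt-4-alias": "gpt-4", "self": "self"}  # stand-in map

def potential_models(model):
    candidates = [model]
    alias = model_alias_map.get(model)  # single lookup instead of test-then-index
    if alias is not None and alias != model:
        # Only add aliases that differ from the original model,
        # so the same name is never checked twice downstream.
        candidates.append(alias)
    return candidates
```

The `alias != model` guard pairs with the deduplication set above: self-referential aliases never make it into the candidate list in the first place.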

**Performance impact analysis**: The test results show this optimization is most effective for:

- Large model lists (769% faster for 10-model lists vs 1000-model org)
- Scenarios where early models in a list are allowed (22.5% faster for allowed model lists)
- Cases with many duplicate aliases/models being checked

**Context relevance**: Based on the function reference showing `can_org_access_model()` being called in a loop during team organization validation, this optimization significantly reduces overhead when validating teams with multiple models against organization permissions - a common administrative operation that benefits greatly from the early-exit and deduplication improvements.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 38 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import pytest
from litellm.proxy.auth.auth_checks import (
    _can_object_call_model,  # internal helper exercised in the fallback-depth test below
    can_org_access_model,
)


class ProxyException(Exception):
    def __init__(self, message, type, param, code):
        super().__init__(message)
        self.type = type
        self.param = param
        self.code = code

class SpecialModelNames:
    class all_proxy_models:
        value = "__all_proxy_models__"

DEFAULT_MAX_RECURSE_DEPTH = 10

# LiteLLM_OrganizationTable stub
class LiteLLM_OrganizationTable:
    def __init__(self, models):
        self.models = models

# Router stub with model_group_alias and get_model_access_groups
class Router:
    def __init__(self, model_group_alias=None, access_groups=None):
        self.model_group_alias = model_group_alias or {}
        self._access_groups = access_groups or {}

    def get_model_access_groups(self, model_name, team_id=None):
        # Returns access groups for the given model_name
        return self._access_groups.get(model_name, {})

    def _get_model_from_alias(self, model):
        if model in self.model_group_alias:
            alias = self.model_group_alias[model]
            if isinstance(alias, str):
                return alias
            elif isinstance(alias, dict) and "model" in alias:
                return alias["model"]
        return None

# litellm stubs
class litellm:
    model_alias_map = {}

# --- Unit tests ---

# 1. Basic Test Cases

def test_access_allowed_direct_match():
    # Org can access model directly listed
    org = LiteLLM_OrganizationTable(models=["gpt-3.5-turbo"])
    codeflash_output = can_org_access_model("gpt-3.5-turbo", org, None) # 12.6μs -> 12.9μs (2.27% slower)


def test_access_allowed_wildcard():
    # Org can access any model if "*" in models
    org = LiteLLM_OrganizationTable(models=["*"])
    codeflash_output = can_org_access_model("gpt-4", org, None) # 14.3μs -> 15.3μs (6.22% slower)
    codeflash_output = can_org_access_model("some-other-model", org, None) # 5.28μs -> 5.64μs (6.30% slower)



def test_access_allowed_team_model_aliases():
    # Org can access model if in team_model_aliases
    org = LiteLLM_OrganizationTable(models=["bert-base"])
    team_model_aliases = {"gpt-4": "bert-base"}
    codeflash_output = can_org_access_model("gpt-4", org, None, team_model_aliases=team_model_aliases) # 8.10μs -> 9.06μs (10.6% slower)


def test_access_allowed_router_model_group_alias():
    # Org can access model via router.model_group_alias
    router = Router(model_group_alias={"gpt-4-alias": "gpt-4"})
    org = LiteLLM_OrganizationTable(models=["gpt-4"])
    codeflash_output = can_org_access_model("gpt-4-alias", org, router) # 17.9μs -> 19.0μs (5.97% slower)


def test_access_allowed_empty_models_and_model_is_star():
    # Org can access any model if models is empty and model is "*"
    org = LiteLLM_OrganizationTable(models=[])
    codeflash_output = can_org_access_model("*", org, None) # 11.7μs -> 11.8μs (0.518% slower)

# 2. Edge Test Cases



def test_access_allowed_router_access_groups():
    # Org can access via router access groups
    router = Router(access_groups={"gpt-4": {"groupA": ["gpt-4"]}})
    org = LiteLLM_OrganizationTable(models=["groupA"])
    codeflash_output = can_org_access_model("gpt-4", org, router) # 8.11μs -> 8.81μs (7.93% slower)


def test_access_allowed_model_list_is_star_and_model_is_not_star():
    # Org can access any model if models contains "*"
    org = LiteLLM_OrganizationTable(models=["*"])
    codeflash_output = can_org_access_model("gpt-4", org, None) # 14.1μs -> 15.0μs (6.53% slower)


def test_access_allowed_model_empty_string_with_star():
    # Org can access empty string model if "*" is allowed
    org = LiteLLM_OrganizationTable(models=["*"])
    codeflash_output = can_org_access_model("", org, None) # 14.3μs -> 15.0μs (4.68% slower)

def test_access_allowed_model_is_list_all_allowed():
    # Org can access if all models in list are allowed
    org = LiteLLM_OrganizationTable(models=["gpt-4", "gpt-3.5-turbo"])
    codeflash_output = can_org_access_model(["gpt-4", "gpt-3.5-turbo"], org, None) # 15.1μs -> 12.3μs (22.5% faster)


def test_access_allowed_max_fallback_depth():
    # Org can access if fallback_depth is less than max
    org = LiteLLM_OrganizationTable(models=["gpt-4"])
    codeflash_output = can_org_access_model("gpt-4", org, None) # 12.8μs -> 12.7μs (0.924% faster)

def test_access_denied_exceed_max_fallback_depth():
    # Org cannot access if fallback_depth exceeds max
    org = LiteLLM_OrganizationTable(models=["gpt-4"])
    # Call the internal helper directly with fallback_depth already at the limit
    with pytest.raises(Exception):
        _can_object_call_model("gpt-4", None, org.models, fallback_depth=DEFAULT_MAX_RECURSE_DEPTH)



def test_access_allowed_none_org_object_and_model_is_star():
    # If org_object is None, but model is "*", allow
    codeflash_output = can_org_access_model("*", None, None) # 11.4μs -> 11.7μs (2.18% slower)

# 3. Large Scale Test Cases




def test_large_scale_model_is_list():
    # Org can access all models in a large list
    org = LiteLLM_OrganizationTable(models=[f"model-{i}" for i in range(1000)])
    model_list = [f"model-{i}" for i in range(10)]
    codeflash_output = can_org_access_model(model_list, org, None) # 1.59ms -> 183μs (769% faster)


import pytest
from litellm.proxy.auth.auth_checks import (
    _can_object_call_model,  # internal helper exercised in the fallback-depth test below
    can_org_access_model,
)

# --- Minimal stubs and helpers for dependencies and types ---

class ProxyException(Exception):
    def __init__(self, message, type=None, param=None, code=None):
        super().__init__(message)
        self.message = message
        self.type = type
        self.param = param
        self.code = code

class SpecialModelNames:
    all_proxy_models = type("EnumValue", (), {"value": "__all_proxy_models__"})

DEFAULT_MAX_RECURSE_DEPTH = 10

# Simulate litellm global state and alias map
class litellm:
    model_alias_map = {}
    cache = None
    suppress_debug_info = True
    fallbacks = []
    default_fallbacks = None
    num_retries = 2
    max_fallbacks = 5
    ROUTER_MAX_FALLBACKS = 5
    request_timeout = 10
    context_window_fallbacks = []
    content_policy_fallbacks = []
    allowed_fails = 3
    _async_success_callback = []
    success_callback = []
    _async_failure_callback = []
    failure_callback = []
    default_fallbacks = None

    class Cache:
        def __init__(self, type, **kwargs):
            pass

    class Chat:
        def __init__(self, params, router_obj):
            pass

    class logging_callback_manager:
        @staticmethod
        def add_litellm_async_success_callback(cb): pass
        @staticmethod
        def add_litellm_success_callback(cb): pass
        @staticmethod
        def add_litellm_async_failure_callback(cb): pass
        @staticmethod
        def add_litellm_failure_callback(cb): pass

# Minimal stub for LiteLLM_OrganizationTable
class LiteLLM_OrganizationTable:
    def __init__(self, models=None):
        self.models = models if models is not None else []

# Minimal stub for Router
class Router:
    def __init__(self, model_group_alias=None, access_groups=None):
        # model_group_alias: dict[str, str] mapping alias -> real model
        self.model_group_alias = model_group_alias or {}
        self.access_groups = access_groups or {}

    def get_model_access_groups(self, model_name, team_id=None):
        # Returns a dict mapping group name -> list of models for the given model_name.
        # For test, just return self.access_groups.get(model_name, {})
        return self.access_groups.get(model_name, {})

    def _get_model_from_alias(self, model):
        # Return the real model name for an alias, or None.
        if model in self.model_group_alias:
            return self.model_group_alias[model]
        return None

# --- Unit tests start here ---

# -------------------- BASIC TEST CASES --------------------

def test_org_can_access_model_direct_match():
    """Test: org can access model if model is in org.models (direct string match)."""
    org = LiteLLM_OrganizationTable(models=["gpt-3.5-turbo", "gpt-4"])
    router = Router()
    codeflash_output = can_org_access_model("gpt-3.5-turbo", org, router) # 13.0μs -> 13.5μs (4.05% slower)
    codeflash_output = can_org_access_model("gpt-4", org, router) # 4.88μs -> 5.62μs (13.1% slower)


def test_org_can_access_model_with_wildcard():
    """Test: org can access model if wildcard '*' is in org.models."""
    org = LiteLLM_OrganizationTable(models=["*"])
    router = Router()
    codeflash_output = can_org_access_model("gpt-3.5-turbo", org, router) # 15.3μs -> 16.4μs (6.77% slower)
    codeflash_output = can_org_access_model("random-model", org, router) # 5.62μs -> 6.20μs (9.33% slower)





def test_org_can_access_model_via_router_model_group_alias():
    """Test: org can access model via router model_group_alias mapping."""
    org = LiteLLM_OrganizationTable(models=["gpt-3.5-turbo"])
    router = Router(model_group_alias={"gpt-4-alias": "gpt-3.5-turbo"})
    # gpt-4-alias is not in org.models, but resolves to gpt-3.5-turbo
    codeflash_output = can_org_access_model("gpt-4-alias", org, router) # 17.3μs -> 18.7μs (7.44% slower)








def test_model_access_group_allows_access():
    """Test: router access_groups allow access via group membership."""
    org = LiteLLM_OrganizationTable(models=["group1"])
    router = Router(access_groups={"gpt-4": {"group1": ["gpt-4", "gpt-3.5-turbo"]}})
    codeflash_output = can_org_access_model("gpt-4", org, router) # 8.17μs -> 9.01μs (9.26% slower)


def test_max_fallback_depth_exceeded():
    """Test: fallback_depth exceeding DEFAULT_MAX_RECURSE_DEPTH raises Exception."""
    org = LiteLLM_OrganizationTable(models=["gpt-3.5-turbo"])
    router = Router()
    # Patch DEFAULT_MAX_RECURSE_DEPTH to 1 for this test
    global DEFAULT_MAX_RECURSE_DEPTH
    old_depth = DEFAULT_MAX_RECURSE_DEPTH
    DEFAULT_MAX_RECURSE_DEPTH = 1
    try:
        with pytest.raises(Exception) as excinfo:
            _can_object_call_model(
                model=["a", "b", "c"],
                llm_router=router,
                models=org.models,
                object_type="org",
                fallback_depth=1,
            )
    finally:
        DEFAULT_MAX_RECURSE_DEPTH = old_depth



def test_org_models_is_none_and_team_model_aliases_allows():
    """Test: org.models is None but team_model_aliases allows access."""
    org = LiteLLM_OrganizationTable(models=None)
    router = Router()
    team_model_aliases = {"alias1": "gpt-4"}
    codeflash_output = can_org_access_model("gpt-4", org, router, team_model_aliases=team_model_aliases) # 12.3μs -> 13.2μs (6.46% slower)

def test_org_models_empty_and_team_model_aliases_allows():
    """Test: org.models is empty but team_model_aliases allows access."""
    org = LiteLLM_OrganizationTable(models=[])
    router = Router()
    team_model_aliases = {"alias1": "gpt-4"}
    codeflash_output = can_org_access_model("gpt-4", org, router, team_model_aliases=team_model_aliases) # 9.78μs -> 10.5μs (6.88% slower)



def test_large_org_models_with_wildcard():
    """Test: org.models with 999 models plus wildcard, should allow all models."""
    models = [f"model-{i}" for i in range(999)] + ["*"]
    org = LiteLLM_OrganizationTable(models=models)
    router = Router()
    for name in ["model-0", "model-998", "foo-bar", "random"]:
        codeflash_output = can_org_access_model(name, org, router) # 380μs -> 383μs (0.659% slower)

def test_large_team_model_aliases():
    """Test: large team_model_aliases dict allows access to all aliases."""
    org = LiteLLM_OrganizationTable(models=["gpt-3.5-turbo"])
    router = Router()
    team_model_aliases = {f"alias-{i}": f"model-{i}" for i in range(999)}
    for i in range(0, 999, 100):
        codeflash_output = can_org_access_model(f"alias-{i}", org, router, team_model_aliases=team_model_aliases) # 24.2μs -> 28.3μs (14.3% slower)

def test_large_access_groups():
    """Test: router with large access_groups mapping allows access via group membership."""
    access_groups = {}
    # For model-0, group0 contains model-0, ..., model-999
    access_groups["model-0"] = {"group0": [f"model-{i}" for i in range(1000)]}
    org = LiteLLM_OrganizationTable(models=["group0"])
    router = Router(access_groups=access_groups)
    codeflash_output = can_org_access_model("model-0", org, router) # 5.79μs -> 5.89μs (1.65% slower)


To edit these changes, run `git checkout codeflash/optimize-can_org_access_model-mhx0iord` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 13, 2025 05:52
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Nov 13, 2025