
Conversation

@Pouyanpi Pouyanpi (Collaborator) commented Dec 1, 2025

OpenAI reasoning models (o1, o3, gpt-5 series excluding gpt-5-chat) only support temperature=1. When NeMo Guardrails uses .bind(temperature=0.001) for deterministic tasks like self-check input/output, the API returns an error:

```
Unsupported value: 'temperature' does not support 0.001 with this model.
Only the default (1) value is supported.
```

This happens because LangChain's ChatOpenAI handles temperature restrictions at initialization time (setting temperature=None for reasoning models), but .bind() bypasses this protection and passes the temperature directly to the API.

  • Added `_filter_params_for_openai_reasoning_models()` function that (see the sketch below):
    • Detects reasoning models by name (`o1*`, `o3*`, `gpt-5*` excluding `gpt-5-chat`)
    • Removes the `temperature` parameter before binding for these models
    • Follows the same pattern as LangChain's `validate_temperature` validator
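
A minimal sketch of what the filter might look like, reconstructed from the detection logic quoted in the review below; the function name matches the PR, but the exact signature and body here are assumptions:

```python
def _filter_params_for_openai_reasoning_models(model_name: str, params: dict) -> dict:
    """Drop `temperature` for OpenAI reasoning models, which reject non-default values.

    Detection mirrors LangChain's `validate_temperature` pattern: o1*, o3*,
    and gpt-5* (excluding gpt-5-chat) are treated as reasoning models.
    """
    name = model_name.lower()  # case-insensitive, per the review summary
    is_openai_reasoning_model = (
        name.startswith("o1")
        or name.startswith("o3")
        or (name.startswith("gpt-5") and "chat" not in name)
    )
    if is_openai_reasoning_model:
        # Return a copy without `temperature`; all other params pass through.
        return {k: v for k, v in params.items() if k != "temperature"}
    return params
```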

To reproduce:

```python
from langchain_openai import ChatOpenAI

# LangChain handles this at init by setting temperature=None
llm = ChatOpenAI(model='gpt-5-mini', temperature=0.1)
print(llm.temperature)  # None

# but .bind() bypasses this protection
messages = [("user", "Say hello")]  # any chat input triggers the error
bound = llm.bind(temperature=0.001)
bound.invoke(messages)  # Error!
```

or, going through NeMo Guardrails' `llm_call`:

```python
import asyncio
from langchain_openai import ChatOpenAI
from nemoguardrails.actions.llm.utils import llm_call

llm = ChatOpenAI(model="gpt-5-mini")

async def test():
    # This would fail before the fix because llm_call binds temperature=0.001
    result = await llm_call(
        llm,
        "Say hello",
        llm_params={"temperature": 0.001, "max_tokens": 100}
    )
    print(result)

asyncio.run(test())
```

Fixes: #1512 #992

@Pouyanpi Pouyanpi self-assigned this Dec 1, 2025
@Pouyanpi Pouyanpi added the bug Something isn't working label Dec 1, 2025
@greptile-apps greptile-apps bot (Contributor) commented Dec 1, 2025

Greptile Summary

This PR fixes a critical bug where OpenAI reasoning models (o1, o3, gpt-5 series excluding gpt-5-chat) fail with BadRequestError when NeMo Guardrails uses .bind(temperature=X) for deterministic tasks. The fix adds _filter_params_for_openai_reasoning_models() that detects reasoning models by name and removes the temperature parameter before binding, since these models only support temperature=1.

Key Changes:

  • Added filtering function in nemoguardrails/actions/llm/utils.py:139-165 that removes temperature from params for reasoning models
  • Applied filter in llm_call() function before binding parameters to the LLM
  • Comprehensive test coverage with 13 parametrized test cases covering reasoning models, regular models, and edge cases
  • Uses case-insensitive model name detection following LangChain's pattern

Issues Found:

  • The fix only applies to the llm_call() function path, but there are 3 other locations in the codebase (nemoguardrails/library/hallucination/actions.py:82, nemoguardrails/evaluate/evaluate_factcheck.py:98, nemoguardrails/evaluate/evaluate_hallucination.py:74) that call .bind() directly with temperature values, which will still fail with reasoning models
  • These other callsites should either use the new filtering function or route through llm_call() to get the protection

Confidence Score: 3/5

  • This PR partially fixes the issue but leaves gaps in other code paths that will still fail with reasoning models
  • The implementation correctly fixes the llm_call() path with proper filtering logic and excellent test coverage. However, there are 3 other locations in the codebase that use .bind() with temperature parameters that will still cause the same BadRequestError with reasoning models. The fix is incomplete until these other callsites are also protected.
  • Pay close attention to nemoguardrails/library/hallucination/actions.py, nemoguardrails/evaluate/evaluate_factcheck.py, and nemoguardrails/evaluate/evaluate_hallucination.py which have unprotected .bind() calls with temperature

Important Files Changed

| Filename | Score | Overview |
| --- | --- | --- |
| `nemoguardrails/actions/llm/utils.py` | 4/5 | Added `_filter_params_for_openai_reasoning_models()` to remove the temperature param for o1/o3/gpt-5 models before binding, fixing API errors |
| `tests/test_actions_llm_utils.py` | 5/5 | Comprehensive test coverage with 13 test cases covering reasoning models (o1, o3, gpt-5*), regular models (gpt-4*), and edge cases |

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant LLMCall as llm_call()
    participant Filter as _filter_params_for_openai_reasoning_models()
    participant LLM as LangChain LLM
    participant API as OpenAI API

    User->>LLMCall: call with llm_params={temperature: 0.001}
    LLMCall->>Filter: filter params for reasoning model
    Filter->>Filter: check if model is o1/o3/gpt-5*
    alt is reasoning model
        Filter->>Filter: remove temperature parameter
        Filter-->>LLMCall: {max_tokens: ...} (temp removed)
    else not reasoning model
        Filter-->>LLMCall: {temperature: 0.001, max_tokens: ...}
    end
    LLMCall->>LLM: llm.bind(**filtered_params)
    LLM->>API: API call with allowed params
    API-->>LLM: success response
    LLM-->>LLMCall: response
    LLMCall-->>User: generated text
```

@greptile-apps greptile-apps bot (Contributor) left a comment

Additional Comments (3)

  1. nemoguardrails/library/hallucination/actions.py, line 82 (link)

    logic: this .bind(temperature=1.0) call will fail with OpenAI reasoning models since it bypasses the _filter_params_for_openai_reasoning_models() function. Consider importing and using the filter function here or calling through llm_call()

  2. nemoguardrails/evaluate/evaluate_factcheck.py, line 98 (link)

    logic: this .bind(temperature=0.8) call will fail with OpenAI reasoning models. Consider importing and using _filter_params_for_openai_reasoning_models() function

  3. nemoguardrails/evaluate/evaluate_hallucination.py, line 74 (link)

    logic: if llm_params contains temperature, this .bind() call will fail with OpenAI reasoning models. Consider importing and using _filter_params_for_openai_reasoning_models() function
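
One way the unprotected callsites could adopt the filter, assuming the signature sketched earlier (the `model_name` lookup is also an assumption):

```python
from nemoguardrails.actions.llm.utils import _filter_params_for_openai_reasoning_models

# Instead of binding unconditionally, e.g. llm.bind(temperature=1.0),
# filter the params first so reasoning models never see `temperature`:
params = _filter_params_for_openai_reasoning_models(
    getattr(llm, "model_name", ""), {"temperature": 1.0}
)
bound_llm = llm.bind(**params)
```

Alternatively, routing these callsites through `llm_call()` would give them the same protection without importing a private helper.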

2 files reviewed, 4 comments


Comment on lines +154 to +158:

```python
is_openai_reasoning_model = (
    model_name.startswith("o1")
    or model_name.startswith("o3")
    or (model_name.startswith("gpt-5") and "chat" not in model_name)
)
```

style: verify this detection logic handles future OpenAI reasoning models (e.g., gpt-6 or o4 series) that may have similar temperature restrictions
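
One possible generalization (a suggestion sketched here, not part of the PR): keep the known prefixes in a single tuple so adding a future family is a one-line change.

```python
# Prefixes of OpenAI model families known to reject non-default temperature.
# A future family (e.g. "o4" or "gpt-6") would be added here; whether such
# models will share this restriction is an assumption.
REASONING_MODEL_PREFIXES = ("o1", "o3", "gpt-5")

def is_openai_reasoning_model(model_name: str) -> bool:
    name = model_name.lower()
    if name.startswith("gpt-5") and "chat" in name:
        return False  # gpt-5-chat supports the full temperature range
    return name.startswith(REASONING_MODEL_PREFIXES)  # str.startswith accepts a tuple
```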


@codecov codecov bot commented Dec 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.



Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: BadRequestError as gpt-5 only supports temperature=1

2 participants