@codeflash-ai codeflash-ai bot commented Nov 15, 2025

📄 38% (0.38x) speedup for hyperliquid.parse_ohlcv in python/ccxt/async_support/hyperliquid.py

⏱️ Runtime: 81.0 microseconds -> 58.6 microseconds (best of 132 runs)
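
As a quick sanity check, the headline percentage can be reproduced from the two runtimes above. The sketch below assumes speedup is defined as original runtime divided by optimized runtime, minus one; that definition is an assumption about how the bot reports it, though it is consistent with the figures shown. The function name `speedup` is hypothetical.

```python
def speedup(original_us: float, optimized_us: float) -> float:
    """Relative speedup: how much faster the optimized run is than the original."""
    return original_us / optimized_us - 1.0


# Runtimes quoted in the PR summary, in microseconds
overall = speedup(81.0, 58.6)
print(f"{overall:.1%}")  # 38.2%
```

Per-test percentages in the generated tests are computed from unrounded timings, so recomputing them from the rounded values shown may differ in the last digit.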

📝 Explanation and details

The optimized code achieves a 38% speedup through three key optimizations that reduce function call overhead and eliminate unnecessary operations:

1. Streamlined safe_integer with fast-path optimizations:

  • Replaced the expensive Exchange.key_exists() function call with direct dictionary access wrapped in try/except, eliminating 75% of the original function's overhead
  • Added early return for integer values to skip unnecessary float() conversion
  • Combined exception handling to reduce branching overhead
  • Added explicit checks for None and empty string values before type conversion
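
The fast path described in these bullets might look roughly like the following. This is a minimal sketch, not the code from this PR: the name `safe_integer_fast` is hypothetical, and the set of caught exceptions is an assumption chosen to cover missing keys, out-of-range list indices, and non-subscriptable containers.

```python
def safe_integer_fast(dictionary, key, default_value=None):
    """Hypothetical fast-path variant of safe_integer (illustrative only)."""
    try:
        # Direct lookup stands in for a separate key-existence check
        value = dictionary[key]
    except (KeyError, IndexError, TypeError):
        return default_value
    # Explicit None / empty-string checks before any conversion
    if value is None or value == '':
        return default_value
    # Exact ints skip the float() round trip (bool is deliberately excluded)
    if type(value) is int:
        return value
    try:
        return int(float(value))
    except (ValueError, TypeError, OverflowError):
        return default_value


print(safe_integer_fast({"t": 1704286800000}, "t"))
print(safe_integer_fast({"t": "abc"}, "t"))
```

The early return matters most for timestamps, which in the sample payloads already arrive as integers.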

2. Inlined safe_number to eliminate function call overhead:

  • Removed the intermediate safe_string() call that was consuming 63.9% of safe_number's execution time
  • Direct dictionary lookup with try/except exception handling
  • Reduced from two function calls per field to just the final parse_number() call
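
One way to inline the lookup while keeping the string-based semantics of the original is sketched below. It is an illustration under stated assumptions, not the PR's implementation: `safe_number_inlined` is a hypothetical name, and `parse_number` here is a simplified stand-in that assumes the exchange converts values with `float`.

```python
def parse_number(value, default=None):
    # Simplified stand-in for the exchange's parse_number(); assumes float conversion
    if value is None:
        return default
    try:
        return float(value)
    except (ValueError, TypeError):
        return default


def safe_number_inlined(obj, key, default_number=None):
    """Hypothetical safe_number with the safe_string() step inlined."""
    try:
        value = obj[key]
    except (KeyError, IndexError, TypeError):
        return default_number
    if value is None or value == '':
        return default_number
    # Keep the original string round trip so that, e.g., True still fails to parse
    if not isinstance(value, str):
        value = str(value)
    return parse_number(value, default_number)


print(safe_number_inlined({"o": "2247.9"}, "o"))
print(safe_number_inlined({"h": True}, "h"))
```

Passing the raw value straight to `parse_number` would be slightly faster, but it would change results for inputs such as booleans, so the sketch keeps the `str()` conversion.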

3. Method reference caching in parse_ohlcv:

  • Stored self.safe_integer and self.safe_number as local variables to eliminate repeated attribute lookups during list construction
  • Reduces method attribute lookups per OHLCV parsing operation from 6 to 2
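
The caching pattern itself is small. The sketch below uses a toy class, `OhlcvParser`, whose helper bodies are simplified assumptions rather than the ccxt implementations; only the shape of `parse_ohlcv` is the point.

```python
class OhlcvParser:
    """Toy stand-in for the exchange class; helper bodies are simplified assumptions."""

    def safe_integer(self, d, key):
        try:
            return int(float(d.get(key)))
        except (ValueError, TypeError, OverflowError):
            return None

    def safe_number(self, d, key):
        try:
            return float(d.get(key))
        except (ValueError, TypeError):
            return None

    def parse_ohlcv(self, ohlcv, market=None):
        # Look up each bound method once; the list below uses local-variable loads
        safe_integer = self.safe_integer
        safe_number = self.safe_number
        return [
            safe_integer(ohlcv, 't'),
            safe_number(ohlcv, 'o'),
            safe_number(ohlcv, 'h'),
            safe_number(ohlcv, 'l'),
            safe_number(ohlcv, 'c'),
            safe_number(ohlcv, 'v'),
        ]


candle = {"t": 1704286800000, "o": "2247.9", "h": "2247.9",
          "l": "2224.6", "c": "2226.4", "v": "591.6427"}
print(OhlcvParser().parse_ohlcv(candle))
```

Local-variable loads are cheaper than attribute lookups on `self` in CPython, which is why binding the methods once can help in a function that runs per candle.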

Performance Impact:
The largest gains appear in the following test cases:

  • 55.5% speedup for typical OHLCV data with mixed numeric/string values
  • 74.5% speedup for null value handling (common in incomplete market data)
  • 51.3% speedup for large number processing

These optimizations are especially valuable in high-frequency trading scenarios where OHLCV parsing happens thousands of times per second. The changes maintain full backward compatibility while significantly reducing CPU overhead in data-intensive operations.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 33 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
import pytest
from ccxt.async_support.hyperliquid import hyperliquid

# -------------------------- UNIT TESTS --------------------------

@pytest.fixture
def hl():
    # Fixture for the hyperliquid instance
    return hyperliquid()

# ========== BASIC TEST CASES ==========

def test_basic_typical_ohlcv(hl):
    # Typical OHLCV dictionary: string prices and volume, integer timestamps and trade count
    ohlcv = {
        "T": 1704287699999,
        "c": "2226.4",
        "h": "2247.9",
        "i": "15m",
        "l": "2224.6",
        "n": 46,
        "o": "2247.9",
        "s": "ETH",
        "t": 1704286800000,
        "v": "591.6427"
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 6.93μs -> 4.46μs (55.5% faster)

def test_basic_numeric_and_string_mix(hl):
    # Some fields as numbers, some as strings
    ohlcv = {
        "t": 1234567890,
        "o": 100.5,
        "h": "101.0",
        "l": 99.0,
        "c": "100.9",
        "v": 50
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 9.86μs -> 7.03μs (40.3% faster)

def test_basic_minimal_valid(hl):
    # Only the required keys, all as strings
    ohlcv = {
        "t": "1",
        "o": "2",
        "h": "3",
        "l": "4",
        "c": "5",
        "v": "6"
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 6.22μs -> 4.54μs (37.0% faster)

# ========== EDGE TEST CASES ==========

def test_missing_fields(hl):
    # Some fields are missing, should return None in their place
    ohlcv = {
        "t": 123,
        "o": "10.0",
        # "h" missing
        "l": "8.0",
        # "c" missing
        "v": "100"
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 7.23μs -> 5.33μs (35.6% faster)

def test_extra_fields_ignored(hl):
    # Extra fields should be ignored
    ohlcv = {
        "t": 1,
        "o": "2",
        "h": "3",
        "l": "4",
        "c": "5",
        "v": "6",
        "extra": "should be ignored"
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 6.31μs -> 4.39μs (43.8% faster)

def test_unparseable_numbers(hl):
    # Unparseable values should result in None
    ohlcv = {
        "t": "abc",  # not an int
        "o": "def",  # not a float
        "h": "3.3",
        "l": None,   # None
        "c": "NaN",  # note: float("NaN") is valid in Python, so this may parse to nan rather than None
        "v": "100"
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 8.76μs -> 7.03μs (24.6% faster)

def test_zero_and_negative_values(hl):
    # Zero and negative values
    ohlcv = {
        "t": 0,
        "o": "-1.23",
        "h": "0",
        "l": "-5.67",
        "c": "0.0",
        "v": "-100"
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 6.56μs -> 4.66μs (40.8% faster)

def test_large_numbers(hl):
    # Very large numbers
    ohlcv = {
        "t": 999999999999999,
        "o": "1e10",
        "h": "2e10",
        "l": "1e9",
        "c": "1.5e10",
        "v": "1e8"
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 6.83μs -> 4.51μs (51.3% faster)

def test_empty_dict(hl):
    # Empty dict: all fields missing, should all be None
    ohlcv = {}
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 5.29μs -> 3.84μs (37.7% faster)

def test_null_values(hl):
    # All fields are None
    ohlcv = {
        "t": None,
        "o": None,
        "h": None,
        "l": None,
        "c": None,
        "v": None
    }
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 4.38μs -> 2.51μs (74.5% faster)


def test_incorrect_types(hl):
    # Dict with wrong types (lists, dicts, bools)
    ohlcv = {
        "t": [1],
        "o": {"x": 1},
        "h": True,
        "l": False,
        "c": "10",
        "v": None
    }
    # Only "c" should parse, others should be None
    codeflash_output = hl.parse_ohlcv(ohlcv); result = codeflash_output # 12.6μs -> 10.3μs (21.6% faster)

# ========== LARGE SCALE TEST CASES ==========

def test_large_scale_many_ohlcvs(hl):
    # Test with a list of 1000 ohlcv dicts, all valid
    base_ohlcv = {
        "t": 1000,
        "o": "1.1",
        "h": "2.2",
        "l": "0.9",
        "c": "1.9",
        "v": "100"
    }
    many_ohlcvs = [dict(base_ohlcv, t=1000+i, o=str(1.1+i), h=str(2.2+i), l=str(0.9-i), c=str(1.9-i), v=str(100+i)) for i in range(1000)]
    results = [hl.parse_ohlcv(ohlcv) for ohlcv in many_ohlcvs]

def test_large_scale_some_invalid(hl):
    # Mix valid and invalid ohlcv dicts in a large list
    valid = {
        "t": 1,
        "o": "1",
        "h": "2",
        "l": "0.5",
        "c": "1.5",
        "v": "10"
    }
    invalid = {
        "t": "bad",
        "o": "bad",
        "h": None,
        "l": "bad",
        "c": None,
        "v": "bad"
    }
    ohlcvs = [valid if i % 2 == 0 else invalid for i in range(1000)]
    results = [hl.parse_ohlcv(ohlcv) for ohlcv in ohlcvs]
    for i, r in enumerate(results):
        if i % 2 == 0:
            pass
        else:
            pass

def test_large_scale_performance(hl):
    # Test that 1000 parses complete quickly and without error
    import time
    base_ohlcv = {
        "t": 123456,
        "o": "10",
        "h": "20",
        "l": "5",
        "c": "15",
        "v": "1000"
    }
    ohlcvs = [dict(base_ohlcv, t=123456+i) for i in range(1000)]
    start = time.time()
    results = [hl.parse_ohlcv(ohlcv) for ohlcv in ohlcvs]
    duration = time.time() - start
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from ccxt.async_support.hyperliquid import hyperliquid

# --- Function under test (minimal implementation for test context) ---
# We define a minimal class with the parse_ohlcv method as per the reference implementation.

class Hyperliquid:
    def safe_integer(self, dictionary, key, default_value=None):
        if key not in dictionary or dictionary[key] is None or dictionary[key] == '':
            return default_value
        value = dictionary[key]
        try:
            return int(float(value))
        except (ValueError, TypeError):
            return default_value

    def safe_string(self, dictionary, key, default_value=None):
        if key not in dictionary or dictionary[key] is None or dictionary[key] == '':
            return default_value
        return str(dictionary[key])

    def parse_number(self, value, default=None):
        if value is None:
            return default
        try:
            return float(value)
        except Exception:
            return default

    def safe_number(self, obj, key, defaultNumber=None):
        value = self.safe_string(obj, key)
        return self.parse_number(value, defaultNumber)

    def parse_ohlcv(self, ohlcv, market=None):
        return [
            self.safe_integer(ohlcv, 't'),
            self.safe_number(ohlcv, 'o'),
            self.safe_number(ohlcv, 'h'),
            self.safe_number(ohlcv, 'l'),
            self.safe_number(ohlcv, 'c'),
            self.safe_number(ohlcv, 'v'),
        ]

# -------------------- BASIC TEST CASES --------------------

To edit these changes git checkout codeflash/optimize-hyperliquid.parse_ohlcv-mhzrxnd0 and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 15, 2025 04:15
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Nov 15, 2025