⚡️ Speed up method TextStreamer.__anext__ by 15%
#451
📄 15% (0.15x) speedup for `TextStreamer.__anext__` in `litellm/llms/vertex_ai/vertex_ai_non_gemini.py`

⏱️ Runtime: 1.23 milliseconds → 1.07 milliseconds (best of 90 runs)

📝 Explanation and details
The optimization applies two micro-optimizations to the async iterator's hot path.

Key changes:
- Pre-compute `len(self.text)` as `self._len` during initialization, eliminating a repeated `len()` call on every iteration.
- Cache `self.index` in a local variable `idx` to reduce attribute lookups in the critical path.

Performance impact:
The optimization achieves a 15% runtime improvement (1.23 ms → 1.07 ms) by reducing overhead in the `__anext__` method. In Python, attribute access (`self.index`) is slower than local variable access (`idx`) because it requires a dictionary lookup in the object's `__dict__`. Pre-computing the length likewise removes the overhead of calling `len()` on each iteration.

Workload benefits:
This optimization is particularly effective for streaming workloads where `__anext__` is called repeatedly. The test results show the optimization maintains correctness across all scenarios (single words, empty strings, concurrent access, large inputs) while providing a consistent speedup. The micro-optimizations matter most in tight loops or high-concurrency async contexts where the `__anext__` method becomes a performance bottleneck.

Note: while individual runtime improved by 15%, the slight throughput decrease (-1.1%) suggests the optimization may behave differently under concurrent load, though the overall gain in sequential access patterns makes this a worthwhile improvement.
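The pattern described above can be sketched as follows. This is a minimal, hypothetical reconstruction of the optimized iterator, not the actual litellm source: the word-splitting behavior and attribute names (`text`, `index`, `_len`) are assumptions for illustration.

```python
import asyncio


class TextStreamer:
    """Hypothetical sketch of the optimized async iterator pattern."""

    def __init__(self, text: str):
        self.text = text.split()    # stream word by word (assumed behavior)
        self.index = 0
        self._len = len(self.text)  # pre-compute length once, not per iteration

    def __aiter__(self):
        return self

    async def __anext__(self):
        idx = self.index            # local variable avoids repeated self.index lookups
        if idx < self._len:
            word = self.text[idx]
            self.index = idx + 1
            return word
        raise StopAsyncIteration


async def collect(streamer: TextStreamer) -> list[str]:
    return [word async for word in streamer]


words = asyncio.run(collect(TextStreamer("hello streaming world")))
print(words)  # ['hello', 'streaming', 'world']
```

The speedup comes purely from moving work out of the per-iteration path: `len()` is called once in `__init__`, and the hot method touches `self` as few times as possible.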
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-TextStreamer.__anext__-mhx4abvu` and push.