⚡️ Speed up method Text2VecEmbeddingFunction.build_from_config by 10%
#128
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
📄 10% (0.10x) speedup for
Text2VecEmbeddingFunction.build_from_configinchromadb/utils/embedding_functions/text2vec_embedding_function.py⏱️ Runtime :
120 microseconds→110 microseconds(best of54runs)📝 Explanation and details
The optimized code implements model caching to avoid repeatedly loading expensive SentenceModel instances and improves import error handling. The key changes are:
What was optimized:
Class-level model caching: Instead of creating a new SentenceModel for each instance, models are cached at the class level using
_model_cache. When the samemodel_nameis used multiple times, the cached model is reused rather than reloaded.Improved import checking: Replaced the try/except ImportError pattern with
importlib.util.find_spec()to check for package availability before importing, which is more explicit and potentially faster for repeated checks.Why this leads to speedup:
importlib.util.find_spec()avoids the exception handling overhead of ImportError when the package is missing.Test case performance:
The optimization particularly benefits scenarios where:
model_name(large scale tests with repeated model names)The 9% speedup in the profiled code comes primarily from the more efficient model initialization path, even though the specific test focuses on the
build_from_configmethod rather than heavy model reuse scenarios.✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
🔎 Concolic Coverage Tests and Runtime
codeflash_concolic_p_g0hne0/tmp4yiorqjm/test_concolic_coverage.py::test_Text2VecEmbeddingFunction_build_from_configcodeflash_concolic_p_g0hne0/tmp4yiorqjm/test_concolic_coverage.py::test_Text2VecEmbeddingFunction_build_from_config_2To edit these changes
git checkout codeflash/optimize-Text2VecEmbeddingFunction.build_from_config-mh7maa0wand push.