⚡️ Speed up function `get_start_span_function` by 18% (#57)
**Open** · codeflash-ai[bot] wants to merge 1 commit into `master`.
Conversation
The optimization replaces the indirect function call in `get_current_span()` with direct inline logic, eliminating function call overhead and reducing stack depth.

**What changed:**

- **Removed function indirection**: Instead of calling `tracing_utils.get_current_span(scope)`, the optimized version implements the same logic inline
- **Eliminated redundant scope resolution**: The original version resolved the scope twice (once in `api.py` → `tracing_utils.py`, then again internally), while the optimized version does it once

**Why it's faster:**

- **Reduced call stack depth**: Eliminates the intermediate call to `tracing_utils.get_current_span()`, saving function call overhead (~100-200 ns per call)
- **Direct attribute access**: `scope.span` is read directly instead of going through another function that does the same operation
- **Better CPU cache efficiency**: Fewer function boundaries mean better instruction cache locality

**Performance characteristics:**

The optimization shows a **17% overall speedup** and is particularly effective for:

- High-frequency span checking scenarios (evident in large-scale test cases showing 17-21% improvements)
- Applications that frequently call `get_current_span()` indirectly through `get_start_span_function()`
- Cases where scope resolution happens repeatedly in tight loops

The line profiler shows that time spent in `get_current_span` dropped from 16.86 ms to 15.29 ms, with the optimization eliminating the function call overhead that was the primary bottleneck in this hot path.
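The before/after shape of the change can be sketched as follows. This is a minimal illustration of the inlining pattern described above: the `Scope` class and the two `get_current_span_*` functions are simplified stand-ins, not the actual sentry_sdk implementation.

```python
class Scope:
    """Stand-in for sentry_sdk's Scope: holds the active span, if any."""

    def __init__(self, span=None):
        self.span = span

    @classmethod
    def get_current_scope(cls):
        return _CURRENT_SCOPE


_CURRENT_SCOPE = Scope(span="active-span")


# Before: two levels of indirection -- the api-layer function delegates to a
# tracing_utils-layer helper, which resolves the scope and reads .span.
def _tracing_utils_get_current_span(scope=None):
    scope = scope or Scope.get_current_scope()
    return scope.span


def get_current_span_indirect(scope=None):
    return _tracing_utils_get_current_span(scope)


# After: the same logic inlined -- one scope resolution, one attribute read,
# and no intermediate frame on the call stack.
def get_current_span_inline(scope=None):
    if scope is None:
        scope = Scope.get_current_scope()
    return scope.span
```

Both variants return the same span; the inline version simply skips one function call per invocation, which is what accounts for the savings in this hot path.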
📄 **18% (0.18x) speedup** for `get_start_span_function` in `sentry_sdk/ai/utils.py`
⏱️ Runtime: 4.22 milliseconds → 3.59 milliseconds (best of 146 runs)
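A "best of N runs" figure like the one above can be reproduced with the standard library's `timeit`. The sketch below measures the cost of one extra call layer; the `Scope` stand-in and function names are illustrative, not sentry_sdk code, and absolute timings will vary by machine.

```python
import timeit


class Scope:
    """Illustrative stand-in holding the active span."""

    def __init__(self, span=None):
        self.span = span


def direct(scope):
    # Inlined version: one attribute read.
    return scope.span


def _helper(scope):
    return scope.span


def indirect(scope):
    # Indirect version: same result through one extra call frame.
    return _helper(scope)


def best_of(fn, scope, runs=5, number=100_000):
    # Take the minimum over several runs, mirroring "best of N runs".
    return min(timeit.repeat(lambda: fn(scope), number=number, repeat=runs))


if __name__ == "__main__":
    scope = Scope()
    print(f"indirect: {best_of(indirect, scope):.4f}s")
    print(f"direct:   {best_of(direct, scope):.4f}s")
```

Taking the minimum over repeated runs filters out scheduler noise, which is why "best of 146 runs" is a more stable basis for comparison than a single measurement.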
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
🔎 Concolic Coverage Tests and Runtime
`codeflash_concolic_j2vbhl1v/tmpy0ye3j9e/test_concolic_coverage.py::test_get_start_span_function`

To edit these changes, run `git checkout codeflash/optimize-get_start_span_function-mg9uw59i` and push.