@seddonym seddonym commented Aug 22, 2025

This PR introduces a couple of optimizations for scanning very large code bases, based on profiling a large closed-source code base with samply. On my laptop, it cuts the time to build the graph from around 4s to 3s.

It was inefficient to scan through all the found packages each time.
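
As a rough illustration of that idea (and of the PR's original title, "Build a map of modules"), the sketch below contrasts a linear scan over the found packages with a lookup table built once up front. `FoundPackage`, `find_by_scan` and `build_index` are hypothetical names chosen for this example; they are not taken from the PR's diff, which is not shown in this thread.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a discovered package; not the project's real type.
struct FoundPackage {
    name: String,
    directory: String,
}

// The slow shape: every lookup walks the whole slice, so N lookups over
// M packages cost O(N * M) comparisons.
fn find_by_scan<'a>(packages: &'a [FoundPackage], name: &str) -> Option<&'a FoundPackage> {
    packages.iter().find(|p| p.name == name)
}

// The faster shape: build the index once, then each lookup is a hash probe.
fn build_index(packages: &[FoundPackage]) -> HashMap<&str, &FoundPackage> {
    packages.iter().map(|p| (p.name.as_str(), p)).collect()
}
```

Building the index is a single pass over the packages, so it pays off as soon as more than a handful of lookups are made against the same set.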

codspeed-hq bot commented Aug 22, 2025

CodSpeed Instrumentation Performance Report

Merging #242 will improve performances by 27.65%

Comparing scanning-optimization (b53a0ed) with main (39c5c0a)

Summary

⚡ 5 improvements
✅ 18 untouched benchmarks

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| test_deep_layers_large_graph_kept | 20.1 ms | 17.2 ms | +16.84% |
| test_no_chain | 1.2 ms | 1.1 ms | +12.65% |
| test_no_chains | 1.2 ms | 1.1 ms | +12.64% |
| test_build_django_from_cache_a_few_misses[350] | 742.2 ms | 671.1 ms | +10.6% |
| test_build_django_uncached | 727.7 ms | 570.1 ms | +27.65% |

@seddonym seddonym changed the title Build a map of modules Scanning optimization Aug 29, 2025
@seddonym seddonym marked this pull request as ready for review August 29, 2025 18:02
@seddonym seddonym requested a review from Peter554 August 29, 2025 18:02
rust/Cargo.toml Outdated
serde_yaml = "0.9"
unindent = "0.2.4"
encoding_rs = "0.8.35"
once_cell = "1.21.3"
Collaborator

Do we need to add once_cell? Can we use lazy_static (already a project dependency) or even std::cell::OnceCell?

Collaborator Author

Good idea - I've switched to lazy_static for consistency.

Collaborator

@Peter554 Peter554 left a comment

One minor comment, but otherwise looks good to me, thanks

We were compiling this on every loop iteration. On a large codebase, this accounted for ~10% of the time taken to build the graph.
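
The general fix for this kind of cost is to build the value once and reuse it. The thread does not say what was being compiled, so the sketch below uses a hypothetical `Matcher` with a deliberately trivial `build` step as a stand-in. The PR itself settled on `lazy_static`; this example uses `std::sync::OnceLock` only to stay dependency-free. (For a `static`, the thread-safe `std::sync::OnceLock` is the standard-library option, since `std::cell::OnceCell` is not `Sync`.) `BUILD_COUNT` exists purely to show that the setup runs once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the expensive setup runs (illustration only).
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a value that is costly to construct.
struct Matcher {
    prefix: String,
}

impl Matcher {
    fn build() -> Self {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        Matcher { prefix: "import ".to_string() }
    }

    fn matches(&self, line: &str) -> bool {
        line.trim_start().starts_with(self.prefix.as_str())
    }
}

// Built on first use, then shared by every later call.
fn matcher() -> &'static Matcher {
    static MATCHER: OnceLock<Matcher> = OnceLock::new();
    MATCHER.get_or_init(Matcher::build)
}

// The loop body only borrows the shared instance; nothing is rebuilt per line.
fn count_matches(lines: &[&str]) -> usize {
    lines.iter().filter(|line| matcher().matches(line)).count()
}
```

Calling `count_matches` any number of times leaves `BUILD_COUNT` at 1, whereas constructing the `Matcher` inside the loop would increment it once per line.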
@seddonym seddonym force-pushed the scanning-optimization branch from a9d5381 to b53a0ed Compare August 30, 2025 14:57
@seddonym seddonym merged commit d650ad3 into main Aug 30, 2025
18 checks passed
@seddonym seddonym deleted the scanning-optimization branch August 30, 2025 15:11
3 participants