@Peter554 Peter554 commented Aug 6, 2025

This PR improves the performance of the find_shortest_chains function by refactoring the underlying implementation to use a more efficient algorithm. The original implementation iterated over the Cartesian product of all possible module pairs, which resulted in O(n²) complexity and poor performance on large graphs.

Changes:

  1. Extracted common chain-finding logic into a new _find_shortest_chains helper function that can be reused across different queries
  2. Replaced the duplicated code in find_illegal_dependencies with calls to the new helper function
  3. Optimized find_shortest_chains to use the same algorithm as find_illegal_dependencies, resulting in significantly better performance in realistic scenarios.

This optimization is particularly beneficial for large dependency graphs where the O(n²) behavior of the previous implementation could cause performance bottlenecks.
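
To make the quadratic cost concrete, here is a minimal Python sketch of a pairwise approach: one breadth-first search per (importer, imported) pair. It is an illustration only, not Grimp's actual code. The names `shortest_chain` and `pairwise_shortest_chains`, the dict-of-sets graph and the module names are all hypothetical.

```python
from collections import deque
from itertools import product


def shortest_chain(graph, start, end):
    """Breadth-first search for one shortest import chain from start to end."""
    if start == end:
        return (start,)
    previous = {start: None}  # module -> the module we reached it from
    queue = deque([start])
    while queue:
        module = queue.popleft()
        for imported in sorted(graph.get(module, ())):
            if imported in previous:
                continue
            previous[imported] = module
            if imported == end:
                # Walk back through `previous` to rebuild the chain.
                chain = [end]
                while previous[chain[-1]] is not None:
                    chain.append(previous[chain[-1]])
                return tuple(reversed(chain))
            queue.append(imported)
    return None


def pairwise_shortest_chains(graph, importers, importeds):
    """Run one search per pair in the Cartesian product of the two sets."""
    chains = set()
    searches = 0
    for importer, imported in product(sorted(importers), sorted(importeds)):
        searches += 1
        chain = shortest_chain(graph, importer, imported)
        if chain is not None:
            chains.add(chain)
    return chains, searches


# Hypothetical import graph: module -> modules it imports.
graph = {
    "app.a": {"lib.x"},
    "app.b": {"util"},
    "util": {"lib.y"},
    "app.c": set(),
}
chains, searches = pairwise_shortest_chains(
    graph, {"app.a", "app.b", "app.c"}, {"lib.x", "lib.y"}
)
print(searches, sorted(chains))
```

With 3 importers and 2 imported modules this runs 6 searches, even though only two chains exist. The number of searches grows with the product of the two set sizes, which is the behaviour the description refers to.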

codspeed-hq bot commented Aug 6, 2025

CodSpeed Instrumentation Performance Report

Merging #238 will degrade performance by 53.36%

Comparing Peter554:improve-find-shortest-chains (d57385e) with main (3176007)

Summary

⚡ 1 improvements
❌ 5 (👁 5) regressions
✅ 17 untouched benchmarks

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| 👁 test_deep_layers_large_graph_kept | 16.8 ms | 19.8 ms | -15.21% |
| 👁 test_deep_layers_large_graph_violated | 10.5 ms | 11.8 ms | -11.73% |
| 👁 test_no_chain | 1.1 ms | 1.2 ms | -11.31% |
| 👁 test_chains_found | 82.4 µs | 176.7 µs | -53.36% |
| test_chains_found_sparse_imports | 144,076.2 µs | 375.1 µs | ×380 |
| 👁 test_no_chains | 1.1 ms | 1.2 ms | -11.35% |
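
For reading the Change column: the figures appear consistent with BASE / HEAD - 1, with large improvements shown as a plain BASE / HEAD ratio. This is inferred from the displayed numbers, not stated in the report. The short Python check below works through that arithmetic; `relative_change` is a hypothetical helper, and small differences are expected because the displayed timings are rounded.

```python
def relative_change(base, head):
    """Change as BASE / HEAD - 1, in percent (negative means HEAD is slower).

    Assumption: this formula is inferred from the table, not taken from
    the report itself.
    """
    return (base / head - 1) * 100


# (BASE, HEAD, reported change in percent), copied from the table above.
# Units cancel out, so ms and µs rows can be treated the same way.
rows = {
    "test_deep_layers_large_graph_kept": (16.8, 19.8, -15.21),
    "test_chains_found": (82.4, 176.7, -53.36),
}

for name, (base, head, reported) in rows.items():
    print(f"{name}: computed {relative_change(base, head):.2f}%, reported {reported}%")

# The one improvement is reported as a ratio rather than a percentage.
speedup = 144_076.2 / 375.1
print(f"test_chains_found_sparse_imports: about x{speedup:.0f}")
```

Under that assumption, the headline -53.36% means test_chains_found takes roughly twice as long on HEAD (82.4 µs to 176.7 µs), while the sparse-imports benchmark is several hundred times faster.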

@Peter554 Peter554 changed the title Improve find shortest chains Optimize find_shortest_chains Performance Aug 6, 2025
@Peter554 Peter554 changed the title Optimize find_shortest_chains Performance Optimize find_shortest_chains Aug 6, 2025
@Peter554 Peter554 force-pushed the improve-find-shortest-chains branch 2 times, most recently from dc83c71 to 3a7775c Compare August 6, 2025 09:33
@seddonym seddonym self-requested a review August 8, 2025 11:05
@seddonym seddonym left a comment

Thanks for looking into this!

It's great to see the improvement for no chains found, but there are several other things which are apparently slower. What are your thoughts? Possibly one for a verbal discussion.


* Add closed layers to layer contract.
* Rename default repository branch to 'main'.
* Optimise `find_shortest_chains` query.

Might be worth mentioning that it might lead to slightly different results in some cases? (I agree with you, the results of find_shortest_chains are not formally specified, though I'd like us to get to that point.)

Peter554 commented Aug 8, 2025

> there are several other things which are apparently slower. What are your thoughts?

@seddonym I think we should merge #239 first, then the improvement will be much clearer. I believe find_shortest_chains is hundreds of times faster after this change, but we need that new benchmark to prove it.

@Peter554 Peter554 force-pushed the improve-find-shortest-chains branch from 3a7775c to acb4720 Compare August 8, 2025 20:04
…s utility

This changes the algorithm used by find_shortest_chains.
The algorithm now matches that used by find_illegal_dependencies.
This algorithm is much more performant (see change in benchmark).

Neither the previous nor the changed algorithm is fully exhaustive,
so I don't think this counts as a breaking change. The algorithm
used by find_shortest_chains was never formally specified, I believe.

All test cases still pass, so the behaviour is practically unchanged.
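
As a rough illustration of why such an algorithm may not be exhaustive, the Python sketch below repeatedly finds one shortest chain between two sets of modules, records it, excludes its imports, and searches again until nothing is found. It is a sketch under stated assumptions, not Grimp's implementation: `shortest_chain_between_sets`, `find_chains`, the use of a plain set for the excluded imports and the module names are all hypothetical.

```python
from collections import deque


def shortest_chain_between_sets(graph, sources, targets, excluded_imports):
    """Multi-source BFS: one shortest chain from any source to any target."""
    previous = {module: None for module in sources}
    queue = deque(sorted(sources))
    while queue:
        module = queue.popleft()
        if module in targets:
            # Walk back through `previous` to rebuild the chain.
            chain = [module]
            while previous[chain[-1]] is not None:
                chain.append(previous[chain[-1]])
            return tuple(reversed(chain))
        for imported in sorted(graph.get(module, ())):
            if imported in previous or (module, imported) in excluded_imports:
                continue
            previous[imported] = module
            queue.append(imported)
    return None


def find_chains(graph, sources, targets):
    """Collect chains until no chain remains once used imports are excluded."""
    if set(sources) & set(targets):
        raise ValueError("this sketch assumes disjoint source and target sets")
    excluded_imports = set()  # (importer, imported) pairs already used
    chains = []
    while True:
        chain = shortest_chain_between_sets(graph, sources, targets, excluded_imports)
        if chain is None:
            return chains
        chains.append(chain)
        # Every chain has at least one import, so the loop always terminates.
        excluded_imports.update(zip(chain, chain[1:]))


# Hypothetical graph in which two chains share the import mid -> low.
graph = {"high.a": {"mid"}, "high.b": {"mid"}, "mid": {"low"}}
print(find_chains(graph, {"high.a", "high.b"}, {"low"}))
```

In this example only high.a -> mid -> low is reported. The chain high.b -> mid -> low is never found because it reuses the excluded import mid -> low. That is one way a fast search can return a non-exhaustive set of chains.
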
@Peter554 Peter554 force-pushed the improve-find-shortest-chains branch from acb4720 to d57385e Compare August 9, 2025 09:04
@Peter554 Peter554 requested a review from seddonym August 9, 2025 09:19
@seddonym seddonym left a comment

Amazing speed up! 👏🏻

I was a bit concerned about the slowdowns in some of the benchmarks, but have run this branch on a very large code base and it doesn't seem to cause a noticeable slowdown - and makes one problematic contract much, much faster.

// We'll add chains to this set as we discover them.
let mut excluded_imports = FxHashMap::default();

loop {

I've noticed that a long-running call can't be interrupted using CMD-C or equivalent. It might be a nice improvement, in another PR, to listen for that somehow.

@seddonym seddonym merged commit 4713d73 into python-grimp:main Aug 11, 2025
18 checks passed