More benchmarks #177
Conversation
pytest-benchmark's benchmark() function returns the result of the benchmarked call, so we can assert within the same test.
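A minimal sketch of that pattern - the `benchmark` stand-in and `find_shortest_chain` target below are illustrative, not the real pytest-benchmark fixture or grimp API:

```python
# Stand-in for the pytest-benchmark fixture: the real fixture times many
# runs, but like this sketch it returns the target's result, which is
# what lets a test assert on the very call it benchmarks.
def benchmark(fn, *args, **kwargs):
    return fn(*args, **kwargs)


def find_shortest_chain(importer, imported):
    # Hypothetical target returning a chain of module names.
    return (importer, "pkg.middle", imported)


result = benchmark(find_shortest_chain, "pkg.a", "pkg.b")
assert result is not None and len(result) == 3
```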
Force-pushed from 7905447 to 7c43ee4
```python
def test_find_shortest_chain(large_graph, benchmark):
    result = benchmark(large_graph.find_shortest_chain, DEEP_LAYERS[0], DEEP_LAYERS[1])
    assert result is not None and len(result) == 5
```
I think we don't need to assert against the exact result here. This file is intended for benchmarks, not full tests, so a quick sanity check seems fine to me.
Another option could be to use snapshots, so that we can assert the exact results without them cluttering the benchmarks: https://github.com/syrupy-project/syrupy
It would be helpful to have it somewhere, at some point (to check feature parity between this and the Rust graph implementation) but I take your point, it could go somewhere else. Don't feel strongly about syrupy either way.
I tried asserting against the exact result here, but it doesn't seem possible - find_shortest_chain/find_shortest_chains seem to return different results between runs. I guess there must be multiple chains of equal length?
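To illustrate why the exact chain can vary while its length stays stable, here is a toy BFS over a graph that has two equally short chains - all module names here are made up for the example, and this is not grimp's implementation:

```python
from collections import deque

# Toy graph with two equally short chains a->b->d and a->c->d. BFS
# returns whichever neighbour happens to be expanded first, so the
# exact chain can differ between implementations or iteration orders,
# while the chain's *length* is always the same.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}


def shortest_chain(start, end):
    queue = deque([(start,)])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == end:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + (nxt,))
    return None


chain = shortest_chain("a", "d")
assert chain is not None and len(chain) == 3  # length is stable
```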
Force-pushed from 73e089a to 303e910
```python
        containers=("mypackage",),
    )
)
assert result == {
```
I'd rather we didn't remove these - we don't actually assert on the exact results of a large graph like this anywhere else and it does give me more confidence that things are happening correctly. Happy for it to be moved out of benchmarks if you feel strongly, but maybe we should leave that for a separate PR.
Okay - I've put this back.
CodSpeed Performance Report

Merging #177 will degrade performance by 66.81%.

Benchmarks breakdown
seddonym left a comment
Thanks for looking at this, great addition.
For some reason the layer checks have become a lot slower though - any idea why? Maybe it's worth separating out the commits adding the new benchmarks so we can merge those ASAP, then keep the other changes separate. Or maybe it's a blip.
```python
graph = ImportGraph()

for importer, importeds in graph_dict.items():
    graph.add_module(importer)
```
Do we know why this changes the result? Adding the import should add any modules as per https://grimp.readthedocs.io/en/stable/usage.html#ImportGraph.add_import
To me this looks like we may have found a bug.
I'm not entirely sure I understand why this changed the results and caused the performance slowdown.
One thing that is happening - the graph is now bigger. E.g. consider:

```json
{
    "pkg.a": ["pkg.b"],
    "pkg.c": []
}
```

In the above JSON, the pkg.c module would not have been added before (since it has no imports, and nothing is importing it).
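A self-contained sketch of why that happens, using plain sets in place of grimp's ImportGraph - the add_module/add_import behaviour is modelled on its documented API, not copied from it:

```python
graph_dict = {"pkg.a": ["pkg.b"], "pkg.c": []}

# Old strategy: modules only enter the graph via add_import, so a module
# with no imports in either direction is silently dropped.
modules_old = set()
for importer, importeds in graph_dict.items():
    for imported in importeds:
        modules_old.update({importer, imported})

# Fixed strategy: explicitly add each importer first (the add_module
# call), then record its imports.
modules_new = set()
for importer, importeds in graph_dict.items():
    modules_new.add(importer)
    for imported in importeds:
        modules_new.update({importer, imported})

assert "pkg.c" not in modules_old  # silently missing before the fix
assert "pkg.c" in modules_new
```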
I've reordered the commits, so that the changes caused e.g. in number of modules becomes more obvious now.
Even though I don't fully understand why this happens, I am entirely confident in this change to the benchmarks file. I.e. we certainly should be calling graph.add_module(importer), and the import path that we now find is correct (it agrees with pyimports). So I think there's no harm in merging this.
Okay, I understand it now.

```python
result = large_graph.find_illegal_dependencies_for_layers(
    layers=(
        # mypackage.foo and mypackage.bar do not exist!!!
        "foo",
        "bar",
    ),
    containers=("mypackage",),
)
assert result == set()
```

The above passes ☝️ So it looks like find_illegal_dependencies_for_layers does not error if the layer modules cannot be found in the graph. Presumably this is by design?
If we look at the large graph JSON we have e.g.:

```json
"mypackage.domain": [],
```

This is the only occurrence of "mypackage.domain" in that file. So - mypackage.domain is not importing anything, and nothing is importing it - hence the module is not added to the graph.
So sadly test_top_level_large_graph was not doing anything before 😅 That's why it is now slower - because it actually does some work!
Ha ok, makes sense.
The silent failure is by design only to the point of feature parity, but I think it would be worth changing it at some point. Not now though.
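A hedged sketch of the stricter behaviour being suggested - ModuleNotPresent and check_layers_exist here are stand-ins modelled on grimp's exception name, not its actual implementation:

```python
class ModuleNotPresent(Exception):
    """Stand-in for grimp.exceptions.ModuleNotPresent."""


def check_layers_exist(graph_modules, layers, container):
    # Raise on a missing layer module instead of silently finding
    # no illegal dependencies at all.
    for layer in layers:
        module = f"{container}.{layer}"
        if module not in graph_modules:
            raise ModuleNotPresent(module)


modules = {"mypackage.plugins", "mypackage.application"}
try:
    check_layers_exist(modules, ("foo", "bar"), "mypackage")
    missing = None
except ModuleNotPresent as exc:
    missing = str(exc)

assert missing == "mypackage.foo"  # loud failure, not an empty result
```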
```python
benchmark(fn)


def test_find_descendants(large_graph, benchmark):
```
Might be worth using benchmark.pedantic for each of these, like with the others, so we get to average out the runs. That will only apply to a local run.
Done.

🐼 I'm not fully sure what this does. Even without pedantic, pytest-benchmark will run many iterations and take an average. What's the difference between iterations and rounds here?
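For what it's worth, the rounds/iterations split can be sketched in plain Python: each round yields one timing sample, and the iterations are the calls timed together inside a round to amortise timer overhead for very fast functions. This is a conceptual model of pytest-benchmark's behaviour, not its code:

```python
import time


def measure(fn, rounds=5, iterations=100):
    # One timing sample per round; within a round the function is
    # called `iterations` times and the elapsed time is divided, so
    # timer resolution/overhead is spread across many calls.
    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        for _ in range(iterations):
            fn()
        samples.append((time.perf_counter() - start) / iterations)
    # pytest-benchmark reports min/mean/stddev across the round samples.
    return min(samples)


best = measure(lambda: sum(range(100)))
assert best >= 0.0
```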
```python
    assert result is None


def test_find_shortest_chains(large_graph, benchmark):
```
Mind grouping these tests for the same method into a TestFindShortestChains class?
Force-pushed from 303e910 to e8c064a
```diff
 def test_find_descendants(large_graph, benchmark):
     result = _run_benchmark(benchmark, large_graph.find_descendants, "mypackage")
-    assert len(result) == 17348
+    assert len(result) == 28222
```
FYI @seddonym - see here that fixing the graph increases the number of modules. I.e. we were missing many modules before.
```python
    )
    assert result is not None


@pytest.mark.xfail("grimp.exceptions.ModuleNotPresent")
```
FYI @seddonym - see here that fixing the graph means that the module mypackage.data.vendors.4053192739.6373932949 now exists, as it should!
We should still add modules, even if they have no imports.
Force-pushed from e8c064a to 455fed2
seddonym left a comment
Great stuff
This PR adds some more benchmarks. I'm making some progress on the Rust graph here, but it would be helpful to have a few more benchmarks in place so that we can measure more things. Currently we only have building the graph and the illegal-layers check - there's quite a lot of ground in between, and the gap feels a bit daunting!
It also fixes how the existing benchmarks build the large graph.