
πŸ“Š Adds automatic benchmarking #241

Merged
paulcscharf merged 6 commits into master from benchmarks on Oct 20, 2021

Conversation

tomrijnbeek
Member

✨ What's this?

This PR adds automatic benchmarking to our PRs, and demonstrates it by adding some simple benchmarks of our Linq extensions.

πŸ”— Relationships

Requirement to complete #183

πŸ” Why do we want this?

This library is specifically aiming to provide high performance code to use in game development. Benchmarks help us quantify this claim, and can catch unexpected performance regressions.

πŸ— How is it done?

The benchmarks themselves are run by BenchmarkDotNet, which appears to be the most widely used benchmarking library for .NET.
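
For reference, a minimal BenchmarkDotNet benchmark class looks roughly like the sketch below (illustrative names only, not the exact benchmarks added in this PR):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Illustrative only: a small BenchmarkDotNet class comparing a Linq call
// against a hand-written loop over the same data.
public class LinqExtensionBenchmarks
{
    private List<int> numbers = new List<int>();

    [GlobalSetup]
    public void Setup()
    {
        var random = new Random(1337);
        numbers = Enumerable.Range(0, 10_000).Select(_ => random.Next()).ToList();
    }

    [Benchmark(Baseline = true)]
    public int SumWithLoop()
    {
        var sum = 0;
        foreach (var n in numbers) sum += n;
        return sum;
    }

    [Benchmark]
    public int SumWithLinq() => numbers.Sum();
}

public static class Program
{
    // BenchmarkDotNet discovers the [Benchmark] methods, runs them with
    // warm-up iterations, and reports statistics per method.
    public static void Main() => BenchmarkRunner.Run<LinqExtensionBenchmarks>();
}
```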

The GitHub workflow is built on this action. It is currently a fork of a different, now inactive action. Not a great state to be in, but this appears to be the only way to actually get performance comparisons between a baseline and the PR version.

The workflow is executed on each PR, as we do with tests. The latest results for the master branch are stored using the cache action (you can see that the final step copies our results back into the cache location, but only when we are on the master branch). If performance regresses significantly, a warning comment is added to the PR.
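
To make the moving parts concrete, here is a rough sketch of this kind of workflow. The comparison action, project path, and cache layout below are placeholders, not the exact contents of this PR; only actions/checkout, actions/setup-dotnet, actions/cache, and the dotnet CLI are real, existing tools:

```yaml
name: Benchmark

on:
  pull_request:
  push:
    branches: [master]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-dotnet@v1
        with:
          dotnet-version: 5.0.x
      # Restore the latest master results so the PR run has a baseline to compare against.
      - uses: actions/cache@v2
        with:
          path: ./benchmark-cache
          key: benchmark-baseline-${{ github.sha }}
          restore-keys: benchmark-baseline-
      - name: Run benchmarks
        run: dotnet run -c Release --project Benchmarks   # placeholder project path
      # Placeholder for the (forked) comparison action: it compares the fresh results
      # against the cached baseline and comments on the PR on a significant regression.
      - name: Compare against master baseline
        uses: example-org/benchmark-compare@v1            # placeholder name
      # Copy the new results into the cache location, but only on master.
      - name: Update baseline
        if: github.ref == 'refs/heads/master'
        run: cp -r BenchmarkDotNet.Artifacts/results ./benchmark-cache/
```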

πŸ’₯ Breaking changes

N/A

πŸ”¬ Why not another way?

There are a few other possibilities:

  • Don't use the action that parses the benchmark, and use the built-in BenchmarkDotNet Markdown exporter instead. The Markdown can then be uploaded as an artefact of the action, and can be made visible that way. It would require manual inspection whenever we change performance-critical code, so making it more proactive sounded like a good deal. However, if things turn out not to work that well, I think this is a great alternative choice (see the sketch after this list).
  • Only run benchmarks on the master branch. This would basically only keep track of ongoing performance, and export the results to our documentation. This didn't seem particularly useful.
  • The GitHub workflow isn't particularly tamperproof. It is fairly straightforward to remove the "if" from the cache persisting action, and taint the cache with wrong results. Changing this would require us to set up a separate GitHub workflow that always gets read from the master branch, so that local PR changes do not change it. This complication was considered not worth it at this time. I believe it will take us some time to find the right way to integrate benchmarks into our workflows, so let's start simple and iterate as needed.
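
As a rough illustration of the first alternative (assuming BenchmarkDotNet's standard configuration API; not code from this PR), the Markdown export could be wired up like this:

```csharp
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Exporters;
using BenchmarkDotNet.Running;

public static class Program
{
    public static void Main()
    {
        // Export results as GitHub-flavoured Markdown; the workflow would then
        // upload the generated file as an artefact for manual inspection.
        var config = ManualConfig.Create(DefaultConfig.Instance)
            .AddExporter(MarkdownExporter.GitHub);

        // Reuses the illustrative LinqExtensionBenchmarks class sketched earlier.
        BenchmarkRunner.Run<LinqExtensionBenchmarks>(config);
    }
}
```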

πŸ¦‹ Side effects

N/A

πŸ’‘ Review hints

None.

[SuppressMessage("ReSharper", "ClassCanBeSealed.Global")]
public sealed class Extensions
{
    private const int seed = 1337;
Member

This could potentially lead to inconsistent results if the implementation of Random changes. Did you try just not using a seed to average over a variety of different seeds? (Assuming seeds of Random objects created quickly after each other are random enough - otherwise we could consider generating them with the static random class...)

Member Author

I believe using a fixed seed is desirable. You are right that this could lead to inconsistent results if the implementation of Random changes. However, the implementation of Random can only change because we update a dependency or the runtime. In that case, the results being inconsistent in itself becomes an important signal. If we see a significant performance reduction because we update our .NET version, then that is worth finding.

By using a variety of different seeds, we surely average things out correctly, but all of a sudden we have to start tuning our benchmarks to make them (1) as fast as possible while at the same time making them (2) run often enough to balance out the statistical noise from the randomness.

A fixed seed is used across the board in the examples in the BenchmarkDotNet documentation. I believe my argument above justifies following that example.
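
For concreteness, the difference between the two approaches would look roughly like this in a setup method (an illustrative sketch, not the code in this PR):

```csharp
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;

public class SeedingSketch
{
    private int[] data = Array.Empty<int>();

    [GlobalSetup]
    public void Setup()
    {
        // Fixed seed: every run benchmarks identical data, so a shift in the
        // results points at a runtime/library change rather than at the input.
        var random = new Random(1337);

        // Unseeded alternative: new Random() picks a different seed each run,
        // averaging over inputs but adding run-to-run variance that the
        // benchmarks would need enough iterations to smooth out.
        // var random = new Random();

        data = Enumerable.Range(0, 10_000).Select(_ => random.Next()).ToArray();
    }

    [Benchmark]
    public int Sum() => data.Sum();
}
```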

Member

The library examples themselves using a static seed surprises me, but perhaps there is indeed a reason... but given that it reruns the test many times, it would still make more sense, at least in principle, to use random seeds to find out NOW if the performance is inconsistent (doesn't the library give us a variance measurement?), instead of waiting for a runtime update to surprise us. :D

Member

@paulcscharf left a comment

I'm happy to take this for now, because it's a good thing, and we can look further into the random stuff another time.

@paulcscharf merged commit 012e785 into master Oct 20, 2021
@paulcscharf deleted the benchmarks branch October 20, 2021 20:24