Add Parlot to JsonBench #96

Open · wants to merge 1 commit into base: main
Conversation

@lahma commented Jan 19, 2021

Parlot is a new parser combinator library by @sebastienros. I added it to JsonBench for reference by bringing in the JSON parser from Parlot's repository.

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
AMD Ryzen 7 2700X, 1 CPU, 16 logical and 8 physical cores
.NET Core SDK=5.0.102
  [Host]     : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT
  DefaultJob : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT

| Method              | Mean       | Error    | StdDev   | Ratio | RatioSD | Gen 0     | Gen 1    | Gen 2 | Allocated  |
|---------------------|-----------:|---------:|---------:|------:|--------:|----------:|---------:|------:|-----------:|
| BigJson_Pidgin      |   430.3 μs |  1.12 μs |  1.05 μs |  1.00 |    0.00 |   24.9023 |   3.4180 |     - |   101.7 KB |
| BigJson_Sprache     | 3,438.3 μs | 15.24 μs | 12.72 μs |  7.99 |    0.04 | 1308.5938 |  50.7813 |     - | 5349.63 KB |
| BigJson_Superpower  | 1,793.6 μs |  7.88 μs |  7.37 μs |  4.17 |    0.02 |  222.6563 |   1.9531 |     - |  913.43 KB |
| BigJson_FParsec     |   461.9 μs |  2.86 μs |  2.54 μs |  1.07 |    0.01 |   83.9844 |   0.9766 |     - |  344.68 KB |
| BigJson_Parlot      |   256.1 μs |  0.43 μs |  0.38 μs |  0.60 |    0.00 |   24.9023 |   2.9297 |     - |   101.8 KB |
| LongJson_Pidgin     |   383.8 μs |  1.09 μs |  0.97 μs |  1.00 |    0.00 |   25.3906 |   2.9297 |     - |  104.25 KB |
| LongJson_Sprache    | 2,812.0 μs |  7.66 μs |  7.17 μs |  7.32 |    0.03 | 1054.6875 |  11.7188 |     - | 4311.36 KB |
| LongJson_Superpower | 1,458.4 μs | 11.98 μs | 10.62 μs |  3.80 |    0.03 |  171.8750 |   3.9063 |     - |  706.79 KB |
| LongJson_FParsec    |   420.2 μs |  2.58 μs |  2.41 μs |  1.09 |    0.01 |   94.2383 |   1.4648 |     - |   386.3 KB |
| LongJson_Parlot     |   213.5 μs |  0.82 μs |  0.73 μs |  0.56 |    0.00 |   25.3906 |   0.7324 |     - |  104.35 KB |
| DeepJson_Pidgin     |   499.2 μs |  1.32 μs |  1.23 μs |  1.00 |    0.00 |   45.8984 |   0.9766 |     - |  187.79 KB |
| DeepJson_Sprache    | 2,947.6 μs |  8.96 μs |  7.48 μs |  5.91 |    0.02 |  554.6875 | 222.6563 |     - | 2946.56 KB |
| DeepJson_FParsec    |   473.1 μs |  1.24 μs |  1.03 μs |  0.95 |    0.00 |   84.4727 |   0.9766 |     - |  346.43 KB |
| DeepJson_Parlot     |   171.5 μs |  1.05 μs |  0.93 μs |  0.34 |    0.00 |   20.0195 |        - |     - |   82.34 KB |
| WideJson_Pidgin     |   231.7 μs |  0.67 μs |  0.56 μs |  1.00 |    0.00 |   11.7188 |   0.2441 |     - |   48.42 KB |
| WideJson_Sprache    | 1,631.0 μs |  5.51 μs |  4.30 μs |  7.04 |    0.02 |  683.5938 |  11.7188 |     - | 2797.28 KB |
| WideJson_Superpower |   899.7 μs |  0.44 μs |  0.41 μs |  3.88 |    0.01 |  112.3047 |   1.9531 |     - |  459.74 KB |
| WideJson_FParsec    |   190.4 μs |  1.91 μs |  1.69 μs |  0.82 |    0.01 |   31.4941 |   3.9063 |     - |  129.02 KB |
| WideJson_Parlot     |   155.9 μs |  0.33 μs |  0.30 μs |  0.67 |    0.00 |   11.7188 |   0.4883 |     - |   48.52 KB |

@benjamin-hodgson (Owner) commented Jan 20, 2021

Interesting. Looks like I can no longer claim to be the fastest in C#! 😉 I'm curious where Parlot gets its speed from. Is it purely down to the fact that Parlot does less thorough error reporting?

@sebastienros

I have no clue where the difference comes from, but it's easier to make something faster when you have a baseline. If you want to use this as an opportunity, I'd suggest checking why Pidgin allocates so much more in the DeepJson scenario; that's where the gap is largest.

I am not sure what you mean by thorough error reporting; maybe I am not aware of a specific feature in Pidgin. In Parlot, errors are reported explicitly with a custom parser construct: if that parser is reached (or the previous one fails), the error is reported. The only limitation I am aware of right now is that there is a single error message, so I need to improve it to continue parsing and report more errors when possible.
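The "explicit error" construct described above can be sketched in a few lines. This is a hypothetical Python model, not Parlot's actual C# API: ordinary parsers fail silently and cheaply, and only an `else_error`-style wrapper (the name is made up here) turns a failure into a reported message.

```python
class ParseError(Exception):
    pass

def literal(text):
    """Match an exact string; fail silently (no error object allocated)."""
    def parse(s, pos):
        if s.startswith(text, pos):
            return pos + len(text), text
        return None  # plain failure, no message created
    return parse

def else_error(parser, message):
    """Turn a silent failure at this point into a reported error."""
    def parse(s, pos):
        result = parser(s, pos)
        if result is None:
            raise ParseError(f"{message} at position {pos}")
        return result
    return parse

# Only the closing brace carries a message; everything else fails silently.
open_brace = literal("{")
close_brace = else_error(literal("}"), "Expected '}'")

def empty_object(s, pos=0):
    r = open_brace(s, pos)
    if r is None:
        return None
    pos, _ = r
    pos, _ = close_brace(s, pos)  # raises ParseError if '}' is missing
    return pos
```

With this shape, `empty_object("{x")` raises the single configured message, matching the "one error message" limitation mentioned above.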

For performance, what we paid attention to is ref structs, not creating results when not necessary, removing interface dispatch, and keeping most things strongly typed. I think I saw a few boxing code paths in Pidgin at some point; that could be a difference. I had a hard time removing such code paths while maintaining a consistent API.
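The "not creating results when not necessary" point can be illustrated with a toy matcher. This is a hedged Python sketch with made-up names (Parlot's real mechanism relies on C# ref structs and generics): the same loop can run in a discard mode that advances the cursor without allocating a result collection.

```python
def many_digits(s, pos, collect=True):
    """Consume a run of digits.

    collect=True  -> build and return the matched characters
    collect=False -> only advance the position; nothing is allocated
    """
    out = [] if collect else None
    while pos < len(s) and s[pos].isdigit():
        if collect:
            out.append(s[pos])
        pos += 1
    return pos, out
```

A caller that only needs to skip a token (say, while scanning for a delimiter) passes `collect=False` and pays no allocation cost; a caller that needs the value pays it once.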

Maybe the main thing is that @lahma seems to like making my dumb code faster ;) He knows all the tricks to gain a few ns here and there.

@benjamin-hodgson (Owner) commented Jan 20, 2021

Re error reporting: Pidgin does quite a lot of work to keep track of what the parser was expecting to encounter, including across branches, so that I can give error messages like `Expected "class" or "struct"`.
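A rough model of that bookkeeping, in Python with hypothetical names (Pidgin's real C# implementation is far more involved): each failing branch reports its label, and the choice combinator merges the labels so the final message can list every alternative that was expected at that point.

```python
def token(expected):
    """Match an exact string; on failure, report what was expected."""
    def parse(s, pos):
        if s.startswith(expected, pos):
            return (pos + len(expected), expected), set()
        return None, {f'"{expected}"'}  # failure + expected-label set
    return parse

def one_of(*parsers):
    """Try each branch; accumulate expected labels across failures."""
    def parse(s, pos):
        expected = set()
        for p in parsers:
            result, exp = p(s, pos)
            if result is not None:
                return result, set()
            expected |= exp  # merge expectations across branches
        return None, expected
    return parse

keyword = one_of(token("class"), token("struct"))
result, expected = keyword("enum Foo", 0)
# result is None; 'expected' now holds both branch labels, so a message
# can be rendered as: "Expected " + " or ".join(sorted(expected))
```

The cost hinted at above is visible even in this sketch: every failed branch allocates and merges label sets, work a parser without cross-branch messages can skip entirely.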

There's also a certain amount of overhead associated with supporting different types of input (that is, not always parsing from a string). That's one of the reasons I have a separate function to enable backtracking (`Try`): I can't guarantee the data is in memory otherwise.
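The interaction between backtracking and streaming input can be sketched as follows. `Try` is Pidgin's name; everything else here is a hypothetical Python model. Rewinding requires keeping everything consumed since the snapshot buffered in memory, which is why backtracking is an explicit opt-in rather than the default.

```python
class Cursor:
    """Cursor over a possibly non-seekable input stream."""
    def __init__(self, stream):
        self.buffer = []          # chunk retained so a rewind is possible
        self.pos = 0
        self.stream = iter(stream)

    def next(self):
        while self.pos >= len(self.buffer):
            self.buffer.append(next(self.stream))  # pull and retain
        ch = self.buffer[self.pos]
        self.pos += 1
        return ch

def match(text):
    """Consume 'text' from the cursor or raise ValueError mid-way."""
    def parse(cur):
        for ch in text:
            if cur.next() != ch:
                raise ValueError(f"expected {ch!r}")
        return text
    return parse

def attempt(parser):  # the role 'Try' plays in Pidgin's API
    """Snapshot the position; on failure, rewind instead of staying consumed."""
    def parse(cur):
        mark = cur.pos            # forces the buffer to stay alive from here
        try:
            return parser(cur)
        except ValueError:
            cur.pos = mark        # rewind to the snapshot
            raise
    return parse
```

Without `attempt`, a failure inside `match` leaves the cursor partway through the input; with it, the cursor is restored, at the price of buffering everything read since the snapshot.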

Beyond that, there might be some overhead in the implementation of the parsers themselves, rather than across-the-board costs (perhaps the loops themselves are not optimised). That seems quite directly tractable, if I can diagnose the worst performers!

@sebastienros

If I were you I'd keep this PR around and use it as an opportunity to make Pidgin faster, if you're willing to and have the time for that.
