Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add operator overloads for parser actions (#70) #114

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sirAndros
Copy link

Hello!
First, thank you for this awesome lib!
I'd testing it within pet project where monsters like ANTLR and Hime are overkill.
Declare parser with solely C# is really good but I found myself uncomfortable when some native actions like "OR" or "NOT" should be declared as method calls even for simple cases.
Found similar issue (#70), but it was long time ago and author seems to disappear.

This PR is proof of concept operator overloads API.

I've implemented hastily some overloads to show you API mockup in ApiSugarParserTests. Everything in this PR is discussible and with high probability should be improved before land into the next release.

"Not" operator is quite straight forward. Can't imagine what to improve :)
"Plus" operator from original issue (#70). This feature needs better design as you and will cause changes in SequenceParser's. Maybe we don't really need it.
"Or" operator is why I'm originally here. It also need some additional work to cover advanced cases (e.g. when apply not, optional etc. for one of operands)
"More" and "Less" operators is very questionable. It looks good to have some short ways to describe Before and Than but this operators can confuse some users. Bitshifts (>>/<<) operators can play here a little better but it's forbidden to use anything but int as second argument for these operators.

Any thought here are very welcome!
Regards, Andrei.

@benjamin-hodgson
Copy link
Owner

benjamin-hodgson commented Sep 9, 2021

Thank you very much for the contribution! I think this is quite a good idea.

  • >, <, |: I like these! >/< have a higher precedence than | so combining them does the right thing and looks like BNF. Looks like p > q < r returns q‘s result too. Very smooth.
  • !: Not is a little bit tricky to use I think. You usually (not always) need to combine it with Lookahead to get the desired behaviour. Most ordinary parsing tasks don’t require Not to start with. So I’m a little bit wary of adding an operator for this - in general the simplest syntax should be reserved for the most common operations
  • +: Sequence feels to me like a slightly strange choice for +. Is Sequence really so important as to warrant its own syntax? Arguably Or would be a better choice for +, I dunno.
  • TToken overloads and conversions: I see why you did this but I’m leaning against including these. Much of the time, users will have their own way to handle tokens (eg whitespace rules) so implicitly converting a token to a parser has fairly limited usefulness in my opinion (and liable to cause mistakes).
  • I do have some general misgivings about operator overloading. Usually when you see > or + in code you expect the arguments to be numbers! So in each case we need to make sure that the improvement in code readability is worth the general confusion that can be caused by this sort of API.

In summary: I like >, < and | and am leaning towards including them in the API because the benefit to readability is so high. The others, I’m not so sure about, would need to see some more concrete/real world examples

@sirAndros
Copy link
Author

Thank you for the fast reaction!
I fully understands your doubts. Actually, when I'd creating this PR, I hoped to include only "or" operator — all the other here is just to try out how it's looks like.

  1. As you correctly noted </> usually means comparation and may be misunderstood by some developers, but now I think that it will be OK once you used to it. For example, in shell it means stream redirection and we live fine with that.
    Furthermore, users can decide by themselves what is more clear for them: > or Before, and after second glance today it seems to me even more expressive than words. So, I'd like to see them in API with | if its possible.
  2. About + - agree, can drop it without second thoughts.
  3. About Not - we can reserve ! for something, but as for C-like syntax programmers it's very natural to say "not" with exclamation mark, but, actually, I rather consider using ~ for this as the action looks like inversion for me, instead of negation. But, again, me personally ain't using this operation frequently and I don't feel that my code become more readable with ! here. Maybe other community members have opinion here, but it could be another API extension iteration.
  4. Finally, about TToken overloads and conversions: my intention here was to bring the way to describe easiest char-based things in something more BNF-like, to make it easier to translate from/to eBNF syntax of another parsers (especially parser generators). Also you usually works with tokens first and only than do "parsing magic" into own types with Select etc.
    But currently it just not works smoothly. I don't investigate why it cause to stack overflow yet (and not sure that I will, actually), but I fear that it can't be the only bug caused by messing with implicit casting in that way, so I agree that it's not what need to be added right now, but I think that if it will be implemented right — it can become impressive way to work with your lib, in my opinion.

P.S. I don't know what to do next. Shall I close PR and you will reimplement the shown concept here or you will create a branch where we can merge this stuff and continue to work atop the changes already done here?

@benjamin-hodgson
Copy link
Owner

I don’t think this PR needs to be closed! You can make the changes on this branch and we’ll merge the PR for release in v3. (V3 is currently a “when I get around to finishing work on it” sort of deal.)

@sirAndros
Copy link
Author

Ok, thanks. This roadmap works for me :-D

@egil
Copy link

egil commented Jun 7, 2022

I don’t think this PR needs to be closed! You can make the changes on this branch and we’ll merge the PR for release in v3. (V3 is currently a “when I get around to finishing work on it” sort of deal.)

Did this make it v3?

It seems like a cool alternative to the LINQ style/query syntax, and could have less perf implications than the query syntax, as far as I can see.

@benjamin-hodgson
Copy link
Owner

It did not, I was awaiting a response from @sirAndros. You're welcome to pick it up if they're no longer interested!

@egil
Copy link

egil commented Jun 8, 2022

I might take you up on this when I get more familiar with the library and general concept of parser combinators.

@sirAndros
Copy link
Author

Hi, guys!
Sorry for disappearing. I actually don't have free time for this for couple of month ahead, so I will be glad if someone continue research of the remaining issues. I can give access to the branch of this PR to continue in it.
Thanks!

@egil
Copy link

egil commented Jul 4, 2022

Hi @sirAndros, saw your invite to your fork. My open source TODO list is unfortunately filled up. Have a new release of https://github.com/bUnit-dev/bUnit I have to work on since we're tracking the .NET 7 release, and those bastards over in the asp.net core repo have a scheduled 😁

Its been super fun getting learning Pidgin last month, but unfortunately don't have any more time to dig deep right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants