Turkish I issue #110
Comments
So, there is actually an inconsistency in Pegasus' handling of the ignore case flag. For strings, this logic is used:
Whereas for character ranges, this logic is used:
It sounds like we need to make this consistent and configurable, yes? Do you feel that it is likely that any given parser would need to specify Current Culture, Invariant Culture, and Ordinal in different spots in their parser? It is important to note that changing these defaults will require a major version bump, as it is a breaking change.
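One way to picture "consistent and configurable" is a single `StringComparison` value that drives both literal matching and character-range matching. The sketch below is plain C#, not Pegasus code: the class and method names are hypothetical, and treating `OrdinalIgnoreCase` like the invariant culture for single characters is a simplifying assumption.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of Pegasus: one StringComparison value
// controls both literal matching and character-range matching.
public static class CaseInsensitiveMatch
{
    // True if `literal` occurs in `subject` at `index` under `comparison`.
    public static bool Literal(string subject, int index, string literal, StringComparison comparison)
    {
        if (index < 0 || index + literal.Length > subject.Length)
        {
            return false;
        }

        return string.Compare(subject, index, literal, 0, literal.Length, comparison) == 0;
    }

    // True if `c` lies in [min, max], or, for an ignore-case comparison,
    // if its lower- or upper-case form under the implied culture does.
    public static bool Range(char c, char min, char max, StringComparison comparison)
    {
        if (c >= min && c <= max)
        {
            return true;
        }

        bool ignoreCase =
            comparison == StringComparison.CurrentCultureIgnoreCase ||
            comparison == StringComparison.InvariantCultureIgnoreCase ||
            comparison == StringComparison.OrdinalIgnoreCase;
        if (!ignoreCase)
        {
            return false;
        }

        // Assumption: OrdinalIgnoreCase is approximated with invariant casing.
        CultureInfo culture = comparison == StringComparison.CurrentCultureIgnoreCase
            ? CultureInfo.CurrentCulture
            : CultureInfo.InvariantCulture;

        char lower = char.ToLower(c, culture);
        char upper = char.ToUpper(c, culture);
        return (lower >= min && lower <= max) || (upper >= min && upper <= max);
    }
}
```

With this shape, a grammar-wide default and a per-expression override would both reduce to choosing which `StringComparison` value gets passed in.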
Yes, I think the handling of strings and character ranges is a bit inconsistent. Also, in the same file at line 228 there is a character range comparison like:
This also needs to be culture-aware. I think it is currently comparing by current culture implicitly. To answer your question, it may be best to give an example. I'm using Pegasus to parse SQL-like expressions. For the aggregation formula I'm using:
This should be invariant, because typing uppercase "COUNTDISTINCT" fails: "I" != "i" in the current culture. For table or column identifiers I used:
This should be culture-aware, because column names may include non-English characters, for example "AlıcıŞube". Maybe it's best to pass an optional culture parameter that defaults to the current culture, and to redefine the "i" flag so that "regex"i means invariant.
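The split described here, invariant for keywords and culture-aware for identifiers, can be sketched in plain C#. This is only an illustration of the two comparison modes, not Pegasus grammar syntax; the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical helpers illustrating the two comparison modes.
public static class SqlLikeNames
{
    // Keywords are fixed text, so the comparison ignores the user's culture.
    public static bool IsKeyword(string text, string keyword)
    {
        return string.Equals(text, keyword, StringComparison.InvariantCultureIgnoreCase);
    }

    // Identifiers may contain letters such as ı or Ş, so they are compared
    // under a culture supplied by the caller.
    public static bool SameIdentifier(string a, string b, CultureInfo culture)
    {
        return string.Compare(a, b, culture, CompareOptions.IgnoreCase) == 0;
    }
}
```

`SqlLikeNames.IsKeyword("COUNTDISTINCT", "countdistinct")` returns true whatever the current culture is. For identifiers, the result of `SameIdentifier` depends on the culture passed in and on the culture data available to the runtime, which is the point of making it a parameter.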
A workaround for Pegasus 4.1 is to use a more specific lexical structure as C# does, e.g. using char.IsLetter:
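As a rough illustration of the idea in plain C# (not Pegasus grammar syntax, and not necessarily the snippet the commenter had in mind), an identifier scanner can classify characters with `char.IsLetter` instead of a case-insensitive range. `char.IsLetter` works from Unicode categories, so it does not depend on the current culture. The class name, method name, and the choice to allow `_` are assumptions.

```csharp
using System;

// Hypothetical scanner: classifies characters by Unicode category
// rather than by a case-insensitive [a-z] range.
public static class IdentifierScanner
{
    // Returns the length of the identifier starting at `index`, or 0 if none.
    public static int ScanIdentifier(string subject, int index)
    {
        if (index < 0 || index >= subject.Length)
        {
            return 0;
        }

        char first = subject[index];
        if (!char.IsLetter(first) && first != '_')
        {
            return 0;
        }

        int end = index + 1;
        while (end < subject.Length &&
               (char.IsLetterOrDigit(subject[end]) || subject[end] == '_'))
        {
            end++;
        }

        return end - index;
    }
}
```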
For 5.0, does it make sense to use |
Hi,
I think there is a Turkish-I problem in the Pegasus/Compiler/CodeGenerator/Grammar.weave file at line 239.
When a "[a-z]i" expression is used, the string comparison is done with CurrentCultureIgnoreCase, and the parser cannot match anything if the "I" character is used. It should be InvariantCultureIgnoreCase to solve this problem. I have worked around this with "[a-zA-Z]i" for now. Maybe it's best to pass a StringComparison enumeration value to the parser for other scenarios.
You can find details at https://blog.codinghorror.com/whats-wrong-with-turkey/
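The underlying behaviour can be reproduced outside Pegasus with a few lines of C#. This is a minimal sketch with hypothetical names; the Turkish result assumes the runtime has Turkish culture data available (it will not show up under globalization-invariant mode).

```csharp
using System;
using System.Globalization;

// Hypothetical demo of the casing rules behind the report.
public static class TurkishIDemo
{
    // Lower-cases 'I' under the named culture. With Turkish culture data,
    // "tr-TR" is expected to give dotless 'ı' (U+0131) rather than 'i',
    // which is why a culture-sensitive match against [a-z] can fail.
    public static char LowerI(string cultureName)
    {
        return char.ToLower('I', CultureInfo.GetCultureInfo(cultureName));
    }

    // Invariant, case-insensitive equality: independent of the current culture.
    public static bool InvariantEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.InvariantCultureIgnoreCase);
    }

    // An explicit [a-zA-Z] test compares code points only, so no culture
    // is involved; this is why the workaround above is unaffected.
    public static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

Comparing `TurkishIDemo.LowerI("tr-TR")` with `TurkishIDemo.LowerI("en-US")` on a machine with full culture data is a quick way to see the difference.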