-
Notifications
You must be signed in to change notification settings - Fork 54
fix(lexer): support size_t literal, separator chars, 'module' #241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The integer parsing grammars still have to be updated.
hkaiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
|
A few days ago I added user defined and standard library literal tokens. re2c works fine, but I ran into issues with xpressive. I hope I can get back to it soon. Should changes for that go into the same .. or a different branch? |
|
I withdraw my previous comment :) I think we should put all related functional changes into the same PR and handle all the various lexers in parallel. I do have a mild preference for separating features (keywords, number separators, literals), into separate commits/PRs, but if they are logically related it's fine. Any thoughts on the CI failures? |
|
I see you have also upgraded re2c from 1.0.2 to 4.1. Actually I'm very interested in trying newer versions of re2c, but I think we should make that a separate PR (and also record what we expect to get in terms of improvements etc.). Can you build with 1.0.2 for now? |
| #define HEXDIGIT "[0-9a-fA-F]" | ||
| #define OPTSIGN "[-+]?" | ||
| #define BINARYDIGIT "[01]" | ||
| #define SIGN "[-+]?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but can we leave this as OPTSIGN? More exact, and also this SIGN doesn't line up with the other strings
|
We should also think a bit about language version support here also. Digit separators were introduced in C++14, which we don't have a separate flag for, but you could argue "0x" would be appropriate at least. size_t literals are C++23 and we don't have a flag for that yet. |
- Fix RE2C code for numbers (binary and digit separators) - Revert to RE2C version 1.0.2, for now - Revamp token ids to minimize changes - Restore existing and more accurate name OPTSIGN in slex - Add binary literal support to lexertl - Fix xlex support for size_t literals - Add test tokens for octal, binary, and hex literals
|
Thats great! - I was pulled off into other projects and had no more time working on this in the last few weeks. |
|
Closed in favor of the related #242 which is a superset |
- Fix RE2C code for numbers (binary and digit separators) - Revert to RE2C version 1.0.2, for now - Revamp token ids to minimize changes - Restore existing and more accurate name OPTSIGN in slex - Add binary literal support to lexertl - Fix xlex support for size_t literals - Add test tokens for octal, binary, and hex literals
This extends the lexers with some of the things that seemed to be missing.
For the separator chars in number literals ', the spirit parsers would need to be adjusted.
I do not remember, whether that could be done by wrapping a scanner or by replacing the parsers..
Additionally I was not sure, at what point of wave these spirit grammars play a role.
After that I am curious about user defined literals...