Conversation

@APokorny
Contributor

This extends the lexers with some of the things that seemed to be missing.
For the digit separator character (') in number literals, the Spirit parsers would need to be adjusted.
I do not remember whether that could be done by wrapping a scanner or by replacing the parsers.
Additionally, I was not sure at what point in Wave these Spirit grammars play a role.
After that, I am curious about user-defined literals...
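
To illustrate the digit-separator problem described above, here is a minimal sketch of one possible workaround: stripping the separators before the digits reach an integer parser. This is a swapped-in technique for illustration only, not the scanner-wrapping or parser-replacement approach mentioned in the comment, and it does not use Spirit. The helper names `strip_digit_separators` and `evaluate_integer_literal` are hypothetical and are not part of Wave.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Wave): drop C++14 digit separators
// so that a separator-unaware integer parser sees only prefix and digits.
std::string strip_digit_separators(std::string_view literal)
{
    std::string out;
    out.reserve(literal.size());
    for (char c : literal) {
        if (c != '\'')
            out.push_back(c);
    }
    return out;
}

// Hypothetical helper: evaluate an unsuffixed integer literal.
// The 0b/0B prefix is handled explicitly because std::stoull with
// base 0 does not portably auto-detect binary; it detects decimal,
// octal (leading 0), and hexadecimal (0x/0X).
std::uint64_t evaluate_integer_literal(std::string_view literal)
{
    assert(!literal.empty());
    const std::string digits = strip_digit_separators(literal);
    if (digits.size() > 2 && digits[0] == '0' &&
        (digits[1] == 'b' || digits[1] == 'B'))
        return std::stoull(digits.substr(2), nullptr, 2);
    return std::stoull(digits, nullptr, 0);
}
```

The sketch assumes the lexer has already validated the token, so misplaced separators such as a trailing `'` are not rejected here.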

The integer parsing grammars still have to be updated.
Collaborator

@hkaiser hkaiser left a comment

LGTM, thanks!

@APokorny
Contributor Author

APokorny commented Sep 5, 2025

A few days ago I added user-defined and standard library literal tokens. re2c works fine, but I ran into issues with xpressive. I hope I can get back to it soon. Should changes for that go into the same branch or a different one?

@jefftrull
Collaborator

I withdraw my previous comment :) I think we should put all related functional changes into the same PR and handle all the various lexers in parallel.

I do have a mild preference for separating features (keywords, number separators, literals) into separate commits/PRs, but if they are logically related it's fine.

Any thoughts on the CI failures?

@jefftrull
Collaborator

I see you have also upgraded re2c from 1.0.2 to 4.1. Actually I'm very interested in trying newer versions of re2c, but I think we should make that a separate PR (and also record what we expect to get in terms of improvements etc.). Can you build with 1.0.2 for now?

#define HEXDIGIT "[0-9a-fA-F]"
#define OPTSIGN "[-+]?"
#define BINARYDIGIT "[01]"
#define SIGN "[-+]?"
Collaborator

Minor, but can we leave this as OPTSIGN? It is more exact, and this SIGN also doesn't line up with the other strings.

@jefftrull
Collaborator

We should also think a bit about language version support here. Digit separators were introduced in C++14, which we don't have a separate flag for, but you could argue "0x" would be appropriate at least. size_t literals are C++23, and we don't have a flag for that yet.
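
As a rough sketch of what version-gated recognition could look like, the function below accepts digit separators only from C++14 onward and the size_t suffixes (`z`, `uz`, `zu`) only from C++23 onward. Everything here is an assumption made for illustration: the `lang_level` enum and `is_decimal_literal` are hypothetical, they do not correspond to Wave's actual language support flags, and `std::regex` stands in for the real lexers (re2c, slex, lexertl, xlex).

```cpp
#include <regex>
#include <string>

// Hypothetical language levels for illustration; these are NOT Wave's flags.
enum class lang_level { cpp11, cpp14, cpp23 };

// Returns true if `text` is a decimal integer literal acceptable at `level`.
// A lone "0" is deliberately not matched: it is an octal literal.
bool is_decimal_literal(const std::string& text, lang_level level)
{
    // C++14 and later allow a single ' between adjacent digits.
    const std::string digits = (level >= lang_level::cpp14)
        ? "[1-9](?:'?[0-9])*"
        : "[1-9][0-9]*";

    // Suffixes available at every level: none, u, l, ll, ul, ull, lu, llu.
    std::string suffix = "[uU]?(?:ll|LL|[lL])?|(?:ll|LL|[lL])[uU]";

    // C++23 adds the size_t suffixes: z, uz, zu.
    if (level >= lang_level::cpp23)
        suffix += "|[uU][zZ]|[zZ][uU]?";

    const std::regex pattern(digits + "(?:" + suffix + ")");
    return std::regex_match(text, pattern);
}
```

With this shape, `1'000'000` is rejected at the hypothetical `cpp11` level and accepted at `cpp14`, while `42uz` is accepted only at `cpp23`.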

jefftrull added a commit to jefftrull/wave that referenced this pull request Oct 25, 2025
- Fix RE2C code for numbers (binary and digit separators)
- Revert to RE2C version 1.0.2, for now
- Revamp token ids to minimize changes
- Restore existing and more accurate name OPTSIGN in slex
- Add binary literal support to lexertl
- Fix xlex support for size_t literals
- Add test tokens for octal, binary, and hex literals
@jefftrull jefftrull marked this pull request as draft October 26, 2025 23:57
@jefftrull
Collaborator

@APokorny I have combined this PR with some bug fixes and the feedback I mentioned above; if you're comfortable with it I will merge #242

@APokorny
Contributor Author

That's great! I was pulled off onto other projects and have had no time to work on this in the last few weeks.

@jefftrull
Collaborator

jefftrull commented Oct 27, 2025

Closed in favor of the related #242, which is a superset.

@jefftrull jefftrull closed this Oct 27, 2025
jefftrull added a commit that referenced this pull request Oct 27, 2025
- Fix RE2C code for numbers (binary and digit separators)
- Revert to RE2C version 1.0.2, for now
- Revamp token ids to minimize changes
- Restore existing and more accurate name OPTSIGN in slex
- Add binary literal support to lexertl
- Fix xlex support for size_t literals
- Add test tokens for octal, binary, and hex literals
3 participants