Fix range in Oniguruma regex for Hack grammar (#159) #162
Closed
slevithan wants to merge 1 commit into slackhq:master from slevithan:regexfix
Thanks for the contribution! Before we can merge this, we need @slevithan to sign the Salesforce Inc. Contributor License Agreement.
Heads up that there was a server error when I signed the CLA at the link above, and now when I return to the Salesforce contributor license agreement page and authenticate using GitHub, it tells me "You already signed the CLA on 2024-12-16".
Created a new PR #163 to correctly trigger the
Fixes #159. See extended details there, but essentially, `\xHH` for values above `7F` doesn't work the same in Oniguruma (the regex engine used by TextMate grammars) as in other regex engines. As a result, a standalone `\xff` is an invalid UTF-8 encoded byte value, rather than a valid code point value as it would be if using `\x{ff}`. So although this regex isn't throwing in Oniguruma due to loose error handling for encoded bytes that are never used in a valid UTF-8 encoded byte sequence (bytes `F5` through `FF`), this is in fact an invalid Oniguruma pattern.

This error is currently leading to edge case bugs in Oniguruma (code points above `FF` are not matched by this negated range) and preventing the Hack grammar from working with Shiki's JS engine (which has stricter error handling for this invalid Oniguruma pattern).
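
For readers who want to check the byte-versus-code-point distinction themselves, here is a minimal sketch in Python. It does not run Oniguruma. It only illustrates the underlying UTF-8 facts, and it uses Python's `re` module as a stand-in for "other regex engines" (that choice is an assumption of this sketch, not something taken from the grammar or from Shiki).

```python
import re


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Code point U+00FF (what \x{ff} refers to) is two bytes in UTF-8, not one.
encoded_code_point = "\u00ff".encode("utf-8")

# A lone FF byte (what a standalone \xff means as an encoded byte) is not valid UTF-8.
lone_ff_is_valid = is_valid_utf8(b"\xff")

# The highest code point, U+10FFFF, has lead byte F4, so bytes F5..FF
# never appear in any well-formed UTF-8 sequence.
max_lead_byte = chr(0x10FFFF).encode("utf-8")[0]

# In Python's `re` (str patterns), \xff denotes the code point U+00FF.
python_treats_as_code_point = re.fullmatch(r"\xff", "\u00ff") is not None

print(encoded_code_point.hex())
print(lone_ff_is_valid)
print(hex(max_lead_byte))
print(python_treats_as_code_point)
```

The point of the comparison is that the same escape can mean "code point U+00FF" in one engine and "raw byte FF" in another, which is why switching to the `\x{ff}` form removes the ambiguity.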