fix: add 168 missing lexer tokens to keyword rule #186

Merged
ako merged 3 commits into mendixlabs:main from engalar:fix/keyword-rule-missing-tokens
Apr 13, 2026

Conversation

@engalar (Contributor) commented Apr 10, 2026

Summary

  • The keyword rule in MDLParser.g4 allows lexer tokens to be used as identifiers (entity/attribute/enum names via qualifiedName). 168 word-type tokens were missing, causing parse failures when user-defined names matched keywords like Data, Filter, Match, Empty, Open, Container, Node, Activity, etc.
  • Reorganized the rule by category with alphabetical ordering for easy auditing
  • Removed 3 duplicate entries (STRUCTURES, PAGING, EXECUTE) and 1 phantom token (UI)
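For context, ANTLR parsers commonly allow keywords in identifier positions with a rule of this shape. The fragment below is an illustrative sketch only — the token names, category comments, and rule names are examples, not the actual contents of MDLParser.g4:

```antlr
// Hypothetical sketch of a keyword-as-identifier rule, organized by
// category with alphabetical ordering within each group.
keyword
    // DDL / DML
    : CREATE | DELETE | UPDATE
    // Domain model
    | ATTRIBUTE | ENTITY | ENUMERATION
    // Pages / UI
    | CONTAINER | LAYOUT | NODE
    // General word tokens reclaimed as identifiers
    | ACTIVITY | DATA | EMPTY | FILTER | MATCH | OPEN
    ;

// Identifier positions accept either a plain identifier token or any
// keyword, so a qualifiedName like Module.ENUM_X.Open can parse.
identifier    : ID | keyword ;
qualifiedName : identifier ('.' identifier)* ;
```

Any word-type token omitted from such a rule becomes unusable as a user-defined name, which is exactly the failure mode this PR fixes.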

Before

-- Parse error: "extraneous input '.' expecting {',', ')'}"
$Issue = CREATE MyModule.OrderIssue (
    Status = MyModule.ENUM_IssueStatus.Open
);

After

-- All keyword-named enum values now parse correctly
$Issue = CREATE MyModule.OrderIssue (
    Status = MyModule.ENUM_IssueStatus.Open,     -- was broken
    Kind = MyModule.ENUM_Kind.Data,               -- was broken
    Mode = MyModule.ENUM_Mode.Filter,             -- was broken
    State = MyModule.ENUM_State.Empty,            -- was broken
    Type = MyModule.ENUM_Type.Container,          -- was broken
    Phase = MyModule.ENUM_Phase.Match,            -- was broken
    Method = MyModule.ENUM_M.Get                  -- was broken
);

Scope

Only the keyword rule in MDLParser.g4 changed (+ regenerated parser files). No executor, visitor, or AST changes.

Follow-up to the PR #174 code review, which identified this systemic gap.

Test plan

  • make test — all tests pass, including TestQuotedIdentifierInWidgetAttribute and TestShowPageMicroflowStyleArgsInWidget, which had briefly broken during development when LAYOUT was accidentally dropped
  • Smoke tested 15+ keyword-as-enum-value cases (Data, Filter, Match, Empty, Container, Open, Node, Activity, Condition, Get, Post, etc.)
  • No original keyword tokens dropped (verified via comm -23)
  • No duplicate tokens in final rule (verified via uniq -d)

@github-actions

AI Code Review

What Looks Good

  • Clear problem/solution explanation: The PR effectively explains that 168 missing lexer tokens were causing parse failures when user-defined names matched keywords (like Data, Filter, Match), with concrete before/after examples showing the fix works.
  • Appropriate scope: The change is narrowly focused on fixing the keyword rule in MDLParser.g4 to resolve a systemic parsing gap identified in PR #174 (fix: add OPEN to keyword rule for enum value parsing), with no unnecessary modifications to executor, visitor, or AST layers (as correctly noted).
  • Maintainability improvements: Reorganized the keyword rule by category with alphabetical ordering for easier auditing, removed 3 duplicate entries and 1 phantom token (UI), improving code quality.
  • Thorough testing: Verified via make test (including previously failing tests), smoke-tested 15+ keyword-as-enum-value cases, confirmed no original tokens were dropped, and verified no duplicates remain.
  • Follows conventions: The PR correctly notes that generated parser files (mdl/grammar/parser/) are noise and should be skipped in review, aligning with project instructions.

Recommendation

Approve. This PR successfully resolves the parsing issue by adding missing tokens to the keyword rule, improves maintainability through reorganization and deduplication, and provides adequate test coverage. The scoped change appropriately addresses the foundation layer without requiring modifications to other pipeline components since it fixes core parsing functionality. No changes are needed.


Automated review via OpenRouter (Nemotron Super 120B) — workflow source

@github-actions

AI Code Review

What Looks Good

  • Bug Fix: Correctly addresses parse failures when user-defined names (entity/attribute/enum names) matched missing lexer tokens like Data, Filter, Match, etc. The before/after examples clearly demonstrate the fix works for enum values and other identifiers.
  • Thorough Reorganization: The keyword rule is now categorized by functional area (DDL/DML, Entity/Domain model, Types, etc.) with alphabetical ordering within each group, making future maintenance and auditing significantly easier.
  • Cleanup: Removed 3 duplicate entries (STRUCTURES, PAGING, EXECUTE) and 1 phantom token (UI) as stated, improving accuracy of the keyword list.
  • Verification: Explicitly confirmed no original tokens were dropped (comm -23) and no duplicates remain (uniq -d), showing due diligence.
  • Test Coverage: All tests pass including previously failing tests, plus manual smoke testing of 15+ keyword-as-enum-value cases.
  • Scoped Change: Limited to the grammar layer only (no AST/visitor/executor changes needed), which is appropriate for this type of parsing fix.
  • Generated Files: Properly regenerated and committed ANTLR parser files after grammar change, as required.
  • MDL Syntax Compliance: Maintains all existing MDL syntax patterns; the fix enables proper use of keywords as identifiers without altering the language design.

Recommendation

Approve. This is a well-executed bug fix that resolves a significant parsing gap while improving code maintainability through thoughtful reorganization. The change is minimal, focused, and thoroughly tested. No modifications to downstream pipeline components are needed or appropriate for this grammar-level fix.


Automated review via OpenRouter (Nemotron Super 120B) — workflow source

@ako (Collaborator) left a comment


This is exactly the broad fix I recommended in the PR #174 review.

What's good

  • Solves the systemic problem: 168 word-type tokens missing from keyword is a significant gap. This affects enum values, attribute names, entity names, anywhere qualifiedName is used.
  • Categorical organization with 18 groups + alphabetical ordering makes future audits trivial — easy to find where to add new tokens
  • Cleanup: removes 3 duplicates (STRUCTURES, PAGING, EXECUTE) and 1 phantom (UI)
  • Pure grammar change — no executor/visitor/AST, lowest possible risk
  • Verification approach documented: comm -23 for missing tokens, uniq -d for duplicates

Concerns

No regression tests added. "Smoke tested 15+ cases" is good for confidence but doesn't prevent regression. A test file with CREATE ENUMERATION Test.E (Open, Data, Filter, Match, Empty, Container, Node, ...) would catch any future drift.

No automated lexer/keyword sync check. A make target that greps word-type tokens from MDLLexer.g4 and verifies each is in the keyword rule (or is explicitly excluded) would prevent this from happening again. Worth a follow-up issue.
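Such a sync check could be sketched in Go roughly as follows. This is a hypothetical illustration, not the project's actual keyword_coverage_test.go: the token-file format shown is the standard ANTLR ".tokens" output, but the real file paths, token naming conventions, and exclusion list are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// extractWordTokens pulls word-type token names from ANTLR ".tokens"
// output, where each line looks like DATA=10; quoted literal entries
// such as '('=1 are skipped.
func extractWordTokens(tokensFile string) []string {
	re := regexp.MustCompile(`^([A-Z][A-Z0-9_]*)=\d+$`)
	var names []string
	for _, line := range strings.Split(tokensFile, "\n") {
		if m := re.FindStringSubmatch(strings.TrimSpace(line)); m != nil {
			names = append(names, m[1])
		}
	}
	return names
}

// missingFromKeywordRule returns every token that is neither explicitly
// excluded nor mentioned (as a whole word) in the keyword rule body.
func missingFromKeywordRule(tokens []string, keywordRule string, excluded map[string]bool) []string {
	var missing []string
	for _, t := range tokens {
		inRule := regexp.MustCompile(`\b` + t + `\b`).MatchString(keywordRule)
		if !excluded[t] && !inRule {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	// Toy stand-ins for MDLLexer.tokens and the MDLParser.g4 keyword rule.
	tokens := "DATA=10\nFILTER=11\nMATCH=12\n'('=1\n"
	rule := "keyword : DATA | FILTER ;"
	fmt.Println(missingFromKeywordRule(extractWordTokens(tokens), rule, nil))
	// prints [MATCH]
}
```

Wired into a test or make target, a non-empty result would fail the build, turning "keyword rule drift" from a silent parse bug into a CI error.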

PR #174 becomes a partial duplicate since this includes the OPEN fix. Should be closed when this merges.

LGTM.

@github-actions

AI Code Review

What Looks Good

  • Correctly addresses the reported issue: The PR fixes parsing failures when user-defined names matched lexer keywords (like Data, Filter, Match), which was causing extraneous input '.' expecting {',', ')'} errors.
  • Thorough test coverage:
    • Added a dedicated test file mdl-examples/doctype-tests/keyword-as-identifier.mdl with comprehensive examples covering enum values, entity names, and attributes that are keywords
    • Added mdl/grammar/keyword_coverage_test.go to automatically verify all lexer tokens are included in the keyword rule, preventing future regressions
  • Improved maintainability:
    • Reorganized the keyword rule by category with alphabetical ordering within each group
    • Removed 3 duplicate entries (STRUCTURES, PAGING, EXECUTE) and 1 phantom token (UI)
    • The structure makes future auditing and updates much easier
  • Proper pipeline execution:
    • Grammar change in MDLParser.g4
    • Parser regenerated (generated files updated as expected)
    • No unnecessary changes to AST, visitor, or executor layers (as none were needed for this fix)
  • Follows MDL design principles:
    • Allows keywords to be used as identifiers in appropriate contexts (enum values, entity/attribute names) without overloading keyword meanings
    • Maintains correct usage of colon for property definitions (Key: value) and AS for name mappings (not applicable here but respected)
    • Uses proper qualified names (Module.Element) in all test examples
    • Statements read as natural English

Recommendation

Approve the PR. This is a well-focused fix that resolves a significant parsing gap while improving code maintainability through better organization and adding robust test coverage to prevent regressions. The change follows all required procedures for MDL syntax updates and properly wires through the parser layer of the pipeline (which is sufficient for this type of fix). The added tests provide excellent validation that the fix works correctly across various keyword-as-identifier scenarios.


Automated review via OpenRouter (Nemotron Super 120B) — workflow source

engalar added 2 commits April 13, 2026 12:01
The keyword rule in MDLParser.g4 allows lexer tokens to be used as
identifiers (entity names, attribute names, enum values). Many tokens
were missing, causing parse failures when user-defined names matched
keywords like Data, Filter, Match, Empty, Open, Container, etc.

This adds all word-type lexer tokens to the keyword rule, organized
by category with alphabetical ordering for easy auditing. Also removes
duplicate entries (STRUCTURES, PAGING, EXECUTE) and a phantom UI token.
@engalar engalar force-pushed the fix/keyword-rule-missing-tokens branch from f4d9459 to 2780fe2 on April 13, 2026 at 04:07
@github-actions

AI Code Review

What Looks Good

  • Fixes a significant parsing issue where 168 lexer tokens were missing from the keyword rule, causing parse failures when user-defined names matched keywords like Data, Filter, Match, etc.
  • Comprehensive solution: added all missing tokens rather than just the reported ones
  • Improved maintainability: reorganized rule by category with alphabetical ordering for easy auditing
  • Cleanup: removed 3 duplicate entries (STRUCTURES, PAGING, EXECUTE) and 1 phantom token (UI)
  • Excellent test coverage:
    • Added keyword-as-identifier.mdl doctest showing practical usage
    • Added keyword_coverage_test.go unit test that verifies all lexer tokens are in the keyword rule (will catch future omissions)
  • Verified no original tokens were dropped and no duplicates remain
  • All tests pass including previously failing ones (TestQuotedIdentifierInWidgetAttribute, TestShowPageMicroflowStyleArgsInWidget)
  • Smoke tested 15+ keyword-as-enum-value cases
  • Changes are scoped appropriately: only grammar change with regenerated parser files (no executor/visitor/AST changes needed)
  • Follows established patterns for allowing keywords as identifiers in specific contexts

Recommendation

Approve. This is a well-executed fix that addresses a critical parsing gap with comprehensive test coverage and improved code organization. The changes are focused, follow project patterns, and prevent similar regressions in the future.


Automated review via OpenRouter (Nemotron Super 120B) — workflow source

…tests

Replace commonNameKeyword with keyword in all identifier rules
(attributeName, enumValueName, indexColumnName, imageName, parameterName,
selectAlias) so any lexer token can be used as an identifier. Remove the
now-redundant commonNameKeyword rule.

Add keyword-as-identifier.mdl regression test covering enum values,
entity names, and attribute names that match lexer keywords.
Add keyword_coverage_test.go to verify lexer/keyword sync.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@engalar engalar force-pushed the fix/keyword-rule-missing-tokens branch from 2780fe2 to 3e7d2d7 on April 13, 2026 at 04:17
@github-actions

AI Code Review

What Looks Good

The PR effectively addresses a significant gap in the MDL parser where 168 lexer tokens were missing from the keyword rule, causing parse failures when user-defined names matched keywords. The solution is comprehensive and well-executed:

  • Thorough fix: Added all missing tokens to the keyword rule, reorganized by category with alphabetical ordering for maintainability
  • Cleanup: Removed 3 duplicate entries and 1 phantom token (UI)
  • Test coverage:
    • Added mdl-examples/doctype-tests/keyword-as-identifier.mdl with extensive test cases
    • Added mdl/grammar/keyword_coverage_test.go to automatically verify lexer-token/keyword-rule parity
    • All tests pass including previously failing tests
  • Correct scoping: As stated, only the grammar rule and generated parser files changed — no executor, visitor, or AST modifications were needed, since this is purely a parsing/enablement fix
  • Maintainability improvements: The reorganization makes future audits easier, and the new test will prevent regressions
  • Verified fix: The before/after examples clearly demonstrate the issue is resolved for enum values like Open, Data, Filter, etc.

The changes follow the project's architecture correctly - since this affects only how identifiers are parsed (allowing keywords to be used as identifiers in appropriate contexts), it doesn't require changes down the stack to AST, visitor, or executor layers.

Recommendation

Approve the PR. This is a well-scoped, thoroughly tested bug fix that resolves a significant parsing gap without introducing any negative side effects. The added test coverage ensures this issue won't recur.


Automated review via OpenRouter (Nemotron Super 120B) — workflow source

@ako (Collaborator) left a comment


Both review concerns fully addressed in the two new commits:

  1. Automated sync test (keyword_coverage_test.go) parses MDLLexer.tokens and verifies every word-type token is in the keyword rule — this prevents the class of bug from ever recurring. Also catches stale/extra entries. Exactly what I recommended.

  2. Regression test file (keyword-as-identifier.mdl) with 117 lines of enumerations using keyword-named values (Open, Data, Filter, Match, Get, Post, Activity, Layout, Header, etc.).

  3. Unified identifier rules to use the full keyword list — good consistency improvement.

LGTM.

@ako ako merged commit f0d16f0 into mendixlabs:main Apr 13, 2026
1 of 2 checks passed

2 participants