Add datetime quickcheck coverage and exponent-like key parsing fixes#63
  // (underscores, exponent notation), produce an Identifier, including for negative bare keys like "-1e2".
  let has_lossy_chars = raw_s != s || (is_float && !raw_s.contains("."))
- if !is_negative && !has_sign && has_lossy_chars {
+ if !has_sign && has_lossy_chars {
🔴 Removing the `!is_negative` guard causes negative numbers with underscores in arrays to be rejected
Changing `if !is_negative && !has_sign && has_lossy_chars` to `if !has_sign && has_lossy_chars` at internal/tokenize/tokenize.mbt:883 lets negative numbers with underscores (e.g., `-1_000`) enter the key-position heuristic. When such a number is the last element of an array (e.g., `arr = [-1_000]`), the `]` at line 888 sets `is_key = true`, so the tokenizer emits `Identifier("-1_000")` instead of `IntegerToken(-1000)`. The parser's `parse_exponent_like_identifier` at parser.mbt:409-418 only handles identifiers containing `e`/`E`, so it returns `None` for `"-1_000"` and parsing fails with an "Expected value" error.

Before this PR, `!is_negative` was false for negative numbers, so the key-position check was skipped entirely and `-1_000` always became `IntegerToken(-1000)`. The regression affects any negative number with underscores that is the last or only element of an array (e.g., `[-1_000]`, `[1, -2_000]`, `[-1.5_0]`).
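To make the regression concrete, here is a small Python model of the condition (not the project's MoonBit code). The names `classify`, `KEY_FOLLOWERS`, and the `guard_negative` flag are hypothetical. It also assumes that `has_sign` refers to an explicit leading `+` and that `=` and `]` are the characters that mark a key position. Only the shape of the condition is taken from the diff.

```python
# Toy model of the key-position heuristic; NOT the real tokenizer.
# Assumption: these are the characters that put a token in "key position".
KEY_FOLLOWERS = {"=", "]"}


def classify(raw: str, next_char: str, guard_negative: bool) -> str:
    """Return the kind of token the modeled heuristic would emit.

    guard_negative=True models the old condition (with !is_negative),
    guard_negative=False models the condition after this PR.
    """
    is_negative = raw.startswith("-")
    has_sign = raw.startswith("+")  # assumption: only an explicit '+' counts
    s = raw.replace("_", "")        # normalized text, underscores dropped
    is_float = any(c in s for c in ".eE")
    has_lossy_chars = raw != s or (is_float and "." not in raw)

    eligible = (not has_sign) and has_lossy_chars
    if guard_negative:
        eligible = eligible and not is_negative

    if eligible and next_char in KEY_FOLLOWERS:
        return "Identifier"
    return "Float" if is_float else "Integer"


# Old condition: the last array element stays a number.
print(classify("-1_000", "]", guard_negative=True))
# New condition: the same element is turned into an identifier.
print(classify("-1_000", "]", guard_negative=False))
```

Under these assumptions, the only difference between the two calls is the guard, which is what lets `-1_000` followed by `]` fall into the identifier branch.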
Prompt for agents
The fix can be made in either of two locations:
1. In internal/tokenize/tokenize.mbt at line 883: The condition `if !has_sign && has_lossy_chars` is too broad. It should either restore the `!is_negative` check and handle the negative exponent key case differently, or the parser's fallback needs to be extended.
2. In parser.mbt at lines 409-418 (parse_exponent_like_identifier): If keeping the tokenizer change, this function needs to handle the underscore case too. It should strip underscores from the name before attempting parse_double, and also handle pure integer identifiers (no 'e'/'E') by trying parse_int64 after stripping underscores. For example:
- If the name contains underscores, strip them and try parsing as integer (via @string.parse_int64) or float (via @string.parse_double)
- Return TomlInteger for integer-like identifiers and TomlFloat for float-like ones
- This would require changing the return type or having the caller handle both cases
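The parser-side fallback described in item 2 could look roughly like the following Python sketch. It is a model only: `parse_numeric_identifier` and the tagged-tuple return value are hypothetical stand-ins for an extended `parse_exponent_like_identifier` and the `TomlInteger`/`TomlFloat` values, and the regular expressions are a simplification that does not enforce every TOML number rule (leading zeros, for example).

```python
import re
from typing import Optional, Tuple, Union

# An underscore is only accepted between two digits.
_BAD_UNDERSCORE = re.compile(r"(?<![0-9])_|_(?![0-9])")
_INT = re.compile(r"[+-]?[0-9]+")
_FLOAT = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?")

Number = Tuple[str, Union[int, float]]


def parse_numeric_identifier(name: str) -> Optional[Number]:
    """Try to recover a number from an identifier-shaped token.

    Returns ("TomlInteger", value) or ("TomlFloat", value), or None when
    the name is not number-like and should stay an identifier.
    """
    if _BAD_UNDERSCORE.search(name):
        return None
    cleaned = name.replace("_", "")  # strip digit separators before parsing
    if _INT.fullmatch(cleaned):
        return ("TomlInteger", int(cleaned))
    if _FLOAT.fullmatch(cleaned):
        return ("TomlFloat", float(cleaned))
    return None


print(parse_numeric_identifier("-1_000"))
print(parse_numeric_identifier("-1.5_0"))
print(parse_numeric_identifier("my_key"))
```

Returning a tagged value is one way to model the return-type change mentioned above. The caller would then branch on the tag instead of assuming a float.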
The simplest fix is option 1: change line 883 back to `if !is_negative && !has_sign && has_lossy_chars` and handle the negative exponent-like key case separately in the tokenizer, perhaps by adding a specific check for negative exponent patterns before the general has_lossy_chars check.
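A rough Python model of option 1 follows, again as a sketch rather than the project's MoonBit code. `classify_option1`, `KEY_FOLLOWERS`, and `NEGATIVE_EXPONENT_KEY` are hypothetical names. It assumes that `has_sign` means an explicit `+` and that `=` and `]` mark a key position. The pattern deliberately excludes underscores and decimal points so that the parser's existing `e`/`E` fallback can still recover the value when the token turns out to be an array element.

```python
import re

KEY_FOLLOWERS = {"=", "]"}  # assumption: characters that mark key position
# Hypothetical pattern for negative exponent-like bare keys such as "-1e2".
NEGATIVE_EXPONENT_KEY = re.compile(r"-[0-9]+[eE]-?[0-9]+")


def classify_option1(raw: str, next_char: str) -> str:
    in_key_position = next_char in KEY_FOLLOWERS

    # Specific check first: negative exponent-like keys become identifiers.
    if in_key_position and NEGATIVE_EXPONENT_KEY.fullmatch(raw):
        return "Identifier"

    # General check with the !is_negative guard restored.
    is_negative = raw.startswith("-")
    has_sign = raw.startswith("+")
    s = raw.replace("_", "")
    is_float = any(c in s for c in ".eE")
    has_lossy_chars = raw != s or (is_float and "." not in raw)
    if not is_negative and not has_sign and has_lossy_chars and in_key_position:
        return "Identifier"

    return "Float" if is_float else "Integer"


print(classify_option1("-1e2", "="))    # negative exponent-like key
print(classify_option1("-1_000", "]"))  # last array element stays numeric
```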