Replace pyparsing with lark for better error messages#124

Open: pkienzle wants to merge 19 commits into master from use-lark-parse

@pkienzle pkienzle commented Mar 5, 2026

This PR replaces pyparsing with lark.

It produces better error messages (#34) and is more robust than the existing parser.

The syntax is represented as a string using EBNF notation. This nicely separates it from parsing and interpretation. The format is clean enough to use directly in the documentation.

grammar = """
start : SPACE? formula SPACE? # strip blank space from start and end
# formula: composite @ density | str:sequence @ density | mixture
formula : compound | mixture
compound : (composite | fasta) [density]
# Density applies to the entire composite, such as "NaCl + 29.2H2O @ 1.07n"
# For the density of a mixture you need parentheses: "(10 wt% NaCl // H2O)@1.07n"
# Activation only cares about total mass, so you can freely mix masses and volumes if
# you have the density for each component. For scattering you need the density of the
# mixture. When this is different from the mixture of densities use (mixture)@density.
# For thin film samples, allow stacking of layers with the thickness of each layer.
# With density for each layer the relative quantities of each element in the stack can
# be calculated. Convert to mass by multiplying density by thickness (cm) and area (cm²).
# mixture: quantity compound // quantity compound // quantity compound
mixture : byamount | byvolume | byweight | layers
byamount : quantity compound (MIX quantity compound)*
byvolume : volumepct compound (MIX percentage compound)* MIX compound
byweight : weightpct compound (MIX percentage compound)* MIX compound
layers : thickness compound (MIX thickness compound)*
quantity : NUMBER SPACE? (MASS | VOLUME) SPACE
weightpct : NUMBER SPACE? WEIGHTPCT SPACE
volumepct : NUMBER SPACE? VOLUMEPCT SPACE
thickness : NUMBER SPACE? LENGTH SPACE
percentage : NUMBER SPACE? "%" SPACE # Allows "3 % "
MIX : SPACE? "//" SPACE?
WEIGHTPCT : /%w((eigh)?t)?/ | /w((eigh)?t)?%/ | /%m(ass)?/ | /m(ass)?%/
VOLUMEPCT : /%v(ol(ume)?)?/ | /v(ol(ume)?)?%/
MASS : "kg" | "g" | "mg" | "ug" | "μg" | "ng"
VOLUME : "L" | "mL" | "uL" | "μL" | "nL"
LENGTH : "cm" | "mm" | "um" | "μm" | "nm" | "Ang" | "Å"
# FASTA sequence: (rna|dna|aa):SEQUENCE @ density
fasta : FASTA ":" SEQUENCE
FASTA : /[a-z]+/ # str:sequence reports better errors than /dna|rna|aa/:sequence
SEQUENCE : /[-A-Z *]+/
# composite: number group number group ... @density
# group: El count El count ...
# Note: optional `[token]` leaves a None placeholder in the tree, unlike `token?`
composite : [NUMBER] group (SEPARATOR [NUMBER] group)*
group : ((atom | isoatom | "(" formula ")") [COUNT])+
atom : SYMBOL [isotope] [valence]
isoatom : SUPERINT SYMBOL [valence] # For example ²H for deuterium
isotope : "[" INTEGER "]"
valence : "{" [INTEGER] CHARGE "}" | [SUPERINT] SUPERCHARGE
density : SPACE? "@" SPACE? DENSITY [DENSITYMODE]
# could list all elements, but better error reporting if element symbol lookup fails
SYMBOL : /[A-Z][a-z]*/
CHARGE : /[+]+|[-]+/ # allow valence using {++} or {--}
SUPERCHARGE: /\u207A+|\u207B+/ # unicode valence such as Ca⁺⁺ and O²⁻
DENSITY : NUMBER # using alias DENSITY for number for better error reporting
DENSITYMODE: /[ni]/ # n=natural density, i=isotopic density
COUNT : NUMBER | SUBNUM # atom counts can be normal numbers or unicode subscripts
SEPARATOR : SPACE? /[+•·]/ SPACE? | SPACE # For example, CaCO₃·6H₂O
SPACE : /[ \\t\\n\\r]+/
NUMBER : INTEGER | FRACTION
INTEGER : /[1-9][0-9]*/
FRACTION : /([1-9][0-9]*|0)?[.][0-9]*/ # allow all floats?
SUBNUM : SUBINT | SUBFRAC
SUBINT : /(\u2080|[\u2081-\u2089][\u2080-\u2089]*)/
SUBFRAC : /(\u2080|[\u2081-\u2089][\u2080-\u2089]*|)([.][\u2080-\u2089]*)/
SUPERINT : /(\u2070|[\u00B9\u00B2\u00B3\u2074-\u2079][\u2070\u00B9\u00B2\u00B3\u2074-\u2079]*)/
"""
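
Because the terminals are ordinary regular expressions, a few of them can be spot-checked outside the parser. The following is a minimal sketch using only Python's `re` module, not lark itself. The `TERMINALS` table and the `accepts` helper are hypothetical names introduced here, with patterns copied from the grammar above. `re.fullmatch` tests one terminal in isolation and ignores how the lexer chooses between competing terminals, so treat it as an approximation.

```python
import re

# Patterns copied from the grammar above. Checking them one at a time with
# re.fullmatch ignores lexer priorities and parser context.
TERMINALS = {
    "WEIGHTPCT": r"%w((eigh)?t)?|w((eigh)?t)?%|%m(ass)?|m(ass)?%",
    "VOLUMEPCT": r"%v(ol(ume)?)?|v(ol(ume)?)?%",
    "INTEGER": r"[1-9][0-9]*",
    "FRACTION": r"([1-9][0-9]*|0)?[.][0-9]*",
    "SUBINT": "(\u2080|[\u2081-\u2089][\u2080-\u2089]*)",
}


def accepts(terminal: str, text: str) -> bool:
    """Return True if the whole text matches the named terminal pattern."""
    return re.fullmatch(TERMINALS[terminal], text) is not None


# Print a small truth table for some interesting inputs.
samples = [
    ("WEIGHTPCT", "wt%"),
    ("WEIGHTPCT", "%mass"),
    ("VOLUMEPCT", "vol%"),
    ("VOLUMEPCT", "val%"),   # misspelling, rejected
    ("INTEGER", "0"),        # INTEGER has no leading zero, so "0" is rejected
    ("FRACTION", "0.5"),
    ("FRACTION", "."),       # a bare "." is accepted by the pattern
    ("SUBINT", "\u2083"),    # subscript three
]
for name, text in samples:
    print(f"{name:10s} {text!r:10s} {accepts(name, text)}")
```

One thing this makes visible is that `FRACTION` accepts a bare `.`, which may be relevant to the "allow all floats?" question in the grammar comment.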

@bpedersen2 Please check if this works with the activation calculator at FRM2: https://webapps.frm2.tum.de/activation/

pkienzle commented May 22, 2026

Most of the error messages are an improvement over the old parser, though some of them are still confusing.

Here's a mix of accepted and rejected formulas along with the associated error messages. Note that lines marked with ## didn't parse in the old parser.

$ python -m periodictable.lark_parse


# === Composite tests ===

*** Co
 => Co @ 8.90

*** H2SO4
 => H₂SO₄

*** CaCO3
 => CaCO₃

*** CaCO₃
 => CaCO₃

*** (Co@5)       ##
 => Co @ 5.00

*** (((Co@5)@6)) ##
 => Co @ 6.00

*** CaCO3+6H2O
 => CaCO₃(H₂O)₆

*** CaCO3 6H2O
 => CaCO₃(H₂O)₆

*** CaCO3(H2O)6
 => CaCO₃(H₂O)₆

*** CaCO3 (H2O)6
 => CaCO₃(H₂O)₆

*** (Ca(CO3)((H2O)6))
 => CaCO₃(H₂O)₆

*** CaCO₃·6H₂O   ##
 => CaCO₃(H₂O)₆

!!!  Bl2Oh   # bad symbol
Element Bl doesn't exist

!!!  (Co     # mismatched LPAR
Expected one of @DENSITY[ni] NUMBER SYMBOL in
 (Co     
        ^

!!!  Co)     # mismatched RPAR
Expected one of @DENSITY[ni] COUNT SYMBOL in
 Co)     
   ^

!!!  ((Co)   # mismatched LPAR
Expected one of @DENSITY[ni] NUMBER SYMBOL in
 ((Co)   
        ^

!!!  ₃H2O    # badly placed subscript
Expected one of NUMBER SYMBOL aa:SEQ in
 ₃H2O    
 ^

# === Isotope tests ===

*** DHO
 => DHO

*** H[1]
 => ¹H @ 0.07

*** ¹⁸O₂
 => ¹⁸O₂ @ 1.28

!!!  Fe[56O2  # bad isotope syntax
Expected ] in
 Fe[56O2  
      ^

!!!  Co[181]  # bad isotope
'181 is not an isotope of Co'

# === Valence tests ===

*** Ca{2+}
 => Ca²⁺ @ 1.55

*** Ca{++}
 => Ca²⁺ @ 1.55

*** Ca⁺⁺   ##
 => Ca²⁺ @ 1.55

*** O{2-}
 => O²⁻ @ 1.14

*** O{--}
 => O²⁻ @ 1.14

*** O²⁻    ##
 => O²⁻ @ 1.14

*** H{+}
 => H⁺ @ 0.07

*** H{-}
 => H⁻ @ 0.07

*** HO{1-}    # HO- applies to the group, but valence is attached to O
 => HO⁻

*** H[1]{1-}O
 => ¹H⁻O

*** ²H⁺       # D{+} ##
 => D⁺ @ 0.14

*** O²H⁻      # no ambiguity since valence requires a trailing + or - ##
 => OD⁻

*** O²⁻H⁺     # O{2-}H{+} ##
 => O²⁻H⁺

*** O²⁻²H⁺    # O{2-}D{+} ##
 => O²⁻D⁺

!!!  Ca{2}   # missing charge in valence
Expected CHARGE[+-] in
 Ca{2}   
     ^

!!!  Ca{2++} # can't use number++
Use 2+ instead of 2++ for valence

!!!  Ca{2+O2 # missing close brace on valence
Expected } in
 Ca{2+O2 
      ^

!!!  Co{17-} # bad valence value
valence 17- is not valid for Co

!!!  Ca ⁺⁺   # extra space before valence
Expected one of @DENSITY[ni] NUMBER SYMBOL in
 Ca ⁺⁺   
    ^

!!!  Ca++    # missing braces in valence: the + is acting as SEPARATOR
Expected one of NUMBER SYMBOL in
 Ca++    
    ^

!!!  Ca2+    # missing braces in valence: the 2 is acting as COUNT and the + as SEPARATOR
Expected one of NUMBER SYMBOL in
 Ca2+    
        ^

!!!  O²      # Should be looking for SUPERCHARGE (e.g., O²⁻) or SYMBOL (e.g., O²H)
Expected SYMBOL in
 O²      
   ^

# === Density tests ===

*** H2O@1               # density is 1, where H and O use natural abundance
 => H₂O @ 1.00

*** H2O @ 1             # spaces allowed around '@' ##
 => H₂O @ 1.00

*** D2O@1n              # natural density "n" is 1 so isotopic density is 1.11
 => D₂O @ 1.11

*** D2O@1.11i           # isotopic density is 1.11
 => D₂O @ 1.11

*** D2O@1.11            # default is "i" for isotopic density
 => D₂O @ 1.11

*** C3H4H[1]NO@1.29n    # another natural density example
 => C₃H₄¹HNO @ 1.29

*** 78.2H2O[16] + 21.8H2O[18] @1n  # density applies to composite
 => (H₂¹⁶O)₇₈.₂(H₂¹⁸O)₂₁.₈ @ 1.02

!!!  3g Ca@ // 5g Si   # missing density value
Expected @DENSITY[ni] in
 3g Ca@ // 5g Si   
        ^

!!!  Ca@i              # missing density value  ##
Expected @DENSITY[ni] in
 Ca@i              
    ^

!!!  H2O@1h            # bad density mode
Expected @DENSITY[ni] in
 H2O@1h            
      ^

# === Mixture tests ===

*** 50 wt% Co // Ti                 # mix by mass; final component does not need percentage
 => CoTi₁.₂₃₁₁₈₆₂₈₇₀₀₃₅₇₂₆ @ 6.01

*** 33 wt% Co // 33% Fe // Ti       # intermediate components need percentage
 => CoFe₁.₀₅₅₂₉₉₃₈₂₂₁₈₆₄₁Ti₁.₂₆₈₄₉₄₉₆₂₃₆₇₃₁₇₂ @ 6.50

!!!  93 wt% Co // 33% Fe // Ti     # more than 100 wt%
Total weight 126% is more than 100% in wt% mixture

!!!  93 vol% Co // 33% Fe // Ti    # more than 100 vol%
Total volume 126% is more than 100% in vol% mixture

*** 20 vol% (10 wt% NaCl@2.16 // H2O@1) // D2O@1n
 => NaCl(H₂O)₂₉.₁₉₅₅₅₅₀₁₀₈₂₄₃₁(D₂O)₁₂₂.₇₈₉₅₃₅₈₈₉₁₄₅₈₅ @ 1.10

*** 5g NaCl // 50mL H2O@1           # volume components need density to determine mass fraction
 => NaCl(H₂O)₃₂.₄₃₉₅₀₅₅₆₇₅₈₂₅₇

*** 5g NaCl@2.16 // 50mL H2O@1      # need component densities to estimate mixture density
 => NaCl(H₂O)₃₂.₄₃₉₅₀₅₅₆₇₅₈₂₅₇ @ 1.05

*** NaCl(H2O)29.1966(D2O)122.794@1.10i  # mixture rendered as formula
 => NaCl(H₂O)₂₉.₁₉₆₆(D₂O)₁₂₂.₇₉₄ @ 1.10

!!!  5g NaCl // 50mL H2O           # need density for H2O to convert volume to mass
Need the mass density of H2O

*** (10 wt% NaCl // H2O)@1.07n      # set density of a mixture
 => NaCl(H₂O)₂₉.₁₉₅₅₅₅₀₁₀₈₂₄₃₁ @ 1.07

*** 50 mL (45 mL H2O@1 // 5 g NaCl)@1.0707 // 20 mL D2O@1n
 => (H₂O)₂₉.₁₉₅₅₅₅₀₁₀₈₂₄₃₁NaCl(D₂O)₁₂.₁₁₈₉₈₉₆₅₈₁₉₈₄₀₂ @ 1.08

*** 1 cm Si // 5 nm Cr // 10 nm Au
 => Si₁₁₉₉₉₂₂.₉₇₃₇₄₆₂₄₉₅CrAu₁.₄₁₇₂₁₈₀₀₉₃₁₂₁₃₆ @ 2.33

!!!  4 nm NaCl@2.17// 50 g Si      # can't use mass in layer mixture
Expected UNIT[mm] in
 4 nm NaCl@2.17// 50 g Si      
                     ^

!!!  3..5 mg NaCl                  # bad number format
Expected one of SYMBOL UNIT[mL] UNIT[mg] UNIT[mm] vol% wt% in
 3..5 mg NaCl                  
   ^

!!!  5 Mg NaCl // 50mL H2O@1       # bad units
Expected one of UNIT[mL] UNIT[mg] UNIT[mm] vol% wt% in
 5 Mg NaCl // 50mL H2O@1       
   ^

!!!  3.5 fm Si                     # bad units; expecting wt%/vol% or LENGTH, VOLUME, MASS
Expected one of UNIT[mL] UNIT[mg] UNIT[mm] vol% wt% in
 3.5 fm Si                     
     ^

!!!  3.5 mm Si // 2.5 nm SiO2 //   # missing final component of mixture
Expected NUMBER in
 3.5 mm Si // 2.5 nm SiO2 //   
                              ^

!!!  3.5 mm Si // 2.5 nm SiO2 // 35 mm cG      # bad final component of mixture
Expected :SEQ in
 3.5 mm Si // 2.5 nm SiO2 // 35 mm cG      
                                    ^

!!!  // 3g Ca                      # // is not a comment
Expected one of NUMBER SYMBOL aa:SEQ in
 // 3g Ca                      
 ^

!!!  37 vol% H2O@1 / 5% D2O@1      # missing /
Expected one of // @DENSITY[ni] in
 37 vol% H2O@1 / 5% D2O@1      
              ^

!!!  37 vol% H2O@1 /// 5% D2O@1    # extra /
Expected one of NUMBER SYMBOL aa:SEQ in
 37 vol% H2O@1 /// 5% D2O@1    
                 ^

!!!  37 vol% NaCl@2.16 // H2O@1 // D2O@1  # percent missing in middle part
Expected end of formula in
 37 vol% NaCl@2.16 // H2O@1 // D2O@1  
                            ^

!!!  37 vol% H2O@1 // 5% D2O@1     # percent not allowed in last part
Expected one of // @DENSITY[ni] in
 37 vol% H2O@1 // 5% D2O@1     
                          ^

!!!  37 vol% H2O@1 // 5 vol% D2O@1 # only % in subsequent parts
Expected % in
 37 vol% H2O@1 // 5 vol% D2O@1 
                    ^

!!!  37% H2O@1 // D2O@1            # missing vol% or wt%
Expected one of SYMBOL UNIT[mL] UNIT[mg] UNIT[mm] vol% wt% in
 37% H2O@1 // D2O@1            
   ^

!!!  37 val% H2O@1 // D2O@1        # bad spelling of vol%
Expected one of UNIT[mL] UNIT[mg] UNIT[mm] vol% wt% in
 37 val% H2O@1 // D2O@1        
    ^

# === FASTA tests ===

*** dna:CAGT
 => C₃₉H₃₇¹H₁₀N₁₅O₂₅P₄ @ 1.69

*** dna:CAGT @1n  # can override the density of a FASTA sequence
 => C₃₉H₃₇¹H₁₀N₁₅O₂₅P₄ @ 1.00

*** aa:RELEELNVPGEIVESLSSSEESITRINKKIEKFQSEEQQQTEDELQDKIHPFAQTQSLVYPFPGPIPNSLPQNIPPLTQTPVVVPPFLQPEVMGVSKVKEAMAPKHKEMPFPKYPVEPFTESQSLTLTDVENLHLPLPLLQSWMHQPHQPLPPTVMFPPQSVLSLSQSKVLPVPQKAVPYPQRDMPIQAFLLYQEPVLGPVRGPFPIIV
 => C₁₀₈₀H₁₃₇₀¹H₃₁₉N₂₆₈O₃₁₀S₆ @ 1.27

!!!  DNA:CAGT    # incorrect case for FASTA type
Expected one of @DENSITY[ni] COUNT SYMBOL in
 DNA:CAGT    
    ^

!!!  dna CAGT    # missing colon between FASTA type and sequence
Expected :SEQ in
 dna CAGT    
    ^

!!!  bad:CAGT    # bad FASTA sequence type
Invalid fasta sequence type 'bad:'

@pkienzle pkienzle marked this pull request as ready for review May 22, 2026 16:26

pkienzle commented May 22, 2026

Here are the error messages from pyparsing for comparison (but in a different order):

!!!  DNA:CAGT  # incorrect case for FASTA type not properly identified
unknown element A

!!!  dna CAGT  # missing colon in FASTA
Expected end of text, found 'dna'  (at char 1), (line:1, col:2)

!!!  O²  # SUPERCHARGE should be the only valid token here
Expected end of text, found '²'  (at char 2), (line:1, col:3)

!!!  ₃H2O  # badly placed subscript
Expected end of text, found ''  (at char 1), (line:1, col:2)

!!!  // 3g Ca  # // is not a comment
Expected end of text, found '/'  (at char 1), (line:1, col:2)

!!!  3g Ca@ // 5g Si # missing density value
Expected end of text, found '3g'  (at char 1), (line:1, col:2)

!!!  Ca@i  # missing density value  ##
!!! pyparsing fails

!!!  Ca ⁺⁺  # extra space before valence
Expected end of text, found ''  (at char 4), (line:1, col:5)

!!!  Ca++  # missing braces in valence: the + is acting as SEPARATOR
Expected end of text, found '+'  (at char 3), (line:1, col:4)

!!!  Ca2+  # missing braces in valence: the 2 is acting as COUNT and the + as SEPARATOR
Expected end of text, found '+'  (at char 4), (line:1, col:5)

!!!  Ca{2}  # missing charge in valence
Expected end of text, found '{'  (at char 3), (line:1, col:4)

!!!  37 vol% H2O@1 / 5% D2O@1  # missing /
Expected end of text, found '37'  (at char 1), (line:1, col:2)

!!!  37 vol% H2O@1 /// 5% D2O@1  # extra /
Expected end of text, found '37'  (at char 1), (line:1, col:2)

!!!  H2O@1h  # bad density mode
Expected end of text, found 'h'  (at char 6), (line:1, col:7)

!!!  37 vol% NaCl@2.16 // H2O@1 // D2O@1  # percent missing in middle part
Expected end of text, found '37'  (at char 1), (line:1, col:2)

!!!  37 vol% H2O@1 // 5% D2O@1  # percent not allowed in last part
Expected end of text, found '37'  (at char 1), (line:1, col:2)

!!!  37 vol% H2O@1 // 5 vol% D2O@1  # only % in subsequent parts
Expected end of text, found '37'  (at char 1), (line:1, col:2)

!!!  37% H2O@1 // D2O@1  # missing vol% or wt%
Expected end of text, found '37'  (at char 1), (line:1, col:2)

!!!  37 val% H2O@1 // D2O@1  # bad spelling of vol%
Expected end of text, found '37'  (at char 1), (line:1, col:2)

!!!  Fe[56O2 # bad isotope syntax
Expected end of text, found '['  (at char 3), (line:1, col:4)

!!!  Co[181]  # bad isotope
'181 is not an isotope of Co'

!!!  Ca{2+O2  # bad valence syntax
Expected end of text, found '{'  (at char 3), (line:1, col:4)

!!!  Co{17-}  # bad valence
valence 17- is not valid for Co

!!!  3..5 mg NaCl
Expected end of text, found '3'  (at char 1), (line:1, col:2)

!!!  3.5 fm Si # bad units at the start; could be wt%/vol% or LENGTH, VOLUME, MASS 
Expected end of text, found '3'  (at char 1), (line:1, col:2)

!!!  3.5 mm Si // 2.5 nm SiO2 //
Expected end of text, found '3'  (at char 1), (line:1, col:2)

!!!  3.5 mm Si // 2.5 nm SiO2 // 35 mm cG
Expected end of text, found '3'  (at char 1), (line:1, col:2)

!!!  ((Co) # mismatched LPAR
Expected end of text, found '('  (at char 1), (line:1, col:2)

!!!  Co)  # mismatched RPAR
Expected end of text, found ')'  (at char 3), (line:1, col:4)

!!!  bad:CAGT  # bad sequence type
Expected end of text, found 'bad'  (at char 1), (line:1, col:2)

*** Co
 => Co @ 8.90

*** dna:CAGT
 => C₃₉H₃₇¹H₁₀N₁₅O₂₅P₄ @ 1.69

*** (Co@5) ##
!!! pyparsing fails

*** (((Co@5)@6)) ##
!!! pyparsing fails

*** CaCO3
 => CaCO₃

*** CaCO₃
 => CaCO₃

*** CaCO3+6H2O
 => CaCO₃(H₂O)₆

*** CaCO3 6H2O
 => CaCO₃(H₂O)₆

*** CaCO3(H2O)6
 => CaCO₃(H₂O)₆

*** CaCO3 (H2O)6
 => CaCO₃(H₂O)₆

*** (Ca(CO3)((H2O)6))
 => CaCO₃(H₂O)₆

*** CaCO₃·6H₂O  ##
!!! pyparsing fails

*** DHO
 => DHO

!!! Ca{2++}  # bad valence string
Expected end of text, found '{'  (at char 2), (line:1, col:3)

*** Ca⁺⁺  # also Ca{2+}  ##
!!! pyparsing fails

*** O²⁻   ##
!!! pyparsing fails

*** H[1]
 => ¹H @ 0.07

*** ²H⁺    # D{+} ##
!!! pyparsing fails

*** O²H⁻   # OD{-} ##
!!! pyparsing fails

*** O²⁻H⁺  # O{2-}H{+} ##
!!! pyparsing fails

*** O²⁻²H⁺ # O{2-}D{+} ##
!!! pyparsing fails

*** H2O@1
 => H₂O @ 1.00

*** D2O@1n
 => D₂O @ 1.11

*** D2O @ 1.11  ##
!!! pyparsing fails

*** D2O@1.11i
 => D₂O @ 1.11

*** HO{1-}
 => HO⁻

*** H[1]{1-}O
 => ¹H⁻O

*** H2SO4
 => H₂SO₄

*** C3H4H[1]NO@1.29n
 => C₃H₄¹HNO @ 1.29

*** 78.2H2O[16] + 21.8H2O[18] @1n  # density applies to composite
 => (H₂¹⁶O)₇₈.₂(H₂¹⁸O)₂₁.₈ @ 1.02

*** dna:CAGT @1n  # fasta density override
 => C₃₉H₃₇¹H₁₀N₁₅O₂₅P₄ @ 1.00

*** 50 wt% Co // Ti
 => CoTi₁.₂₃₁₁₈₆₂₈₇₀₀₃₅₇₂₆ @ 6.01

*** 33 wt% Co // 33% Fe // Ti
 => CoFe₁.₀₅₅₂₉₉₃₈₂₂₁₈₆₄₁Ti₁.₂₆₈₄₉₄₉₆₂₃₆₇₃₁₇₂ @ 6.50

!!!  93 wt% Co // 33% Fe // Ti  # More than 100 wt%
Expected end of text, found '93'  (at char 1), (line:1, col:2)

!!!  93 vol% Co // 33% Fe // Ti  # More than 100 vol%
Expected end of text, found '93'  (at char 1), (line:1, col:2)

*** 20 vol% (10 wt% NaCl@2.16 // H2O@1) // D2O@1n
 => NaCl(H₂O)₂₉.₁₉₅₅₅₅₀₁₀₈₂₄₃₁(D₂O)₁₂₂.₇₈₉₅₃₅₈₈₉₁₄₅₈₅ @ 1.10

*** NaCl(H2O)29.1966(D2O)122.794@1.10i
 => NaCl(H₂O)₂₉.₁₉₆₆(D₂O)₁₂₂.₇₉₄ @ 1.10

*** 5g NaCl // 50mL H2O@1
 => NaCl(H₂O)₃₂.₄₃₉₅₀₅₅₆₇₅₈₂₅₇

*** 5g NaCl@2.16 // 50mL H2O@1
 => NaCl(H₂O)₃₂.₄₃₉₅₀₅₅₆₇₅₈₂₅₇ @ 1.05

!!!  5g NaCl // 50mL H2O   # Need density for H2O to convert volume to mass
Expected end of text, found '5g'  (at char 1), (line:1, col:2)

*** (10 wt% NaCl // H2O)@1.07n # set density of a mixture
 => NaCl(H₂O)₂₉.₁₉₅₅₅₅₀₁₀₈₂₄₃₁ @ 1.07

*** 50 mL (45 mL H2O@1 // 5 g NaCl)@1.0707 // 20 mL D2O@1n
 => (H₂O)₂₉.₁₉₅₅₅₅₀₁₀₈₂₄₃₁NaCl(D₂O)₁₂.₁₁₈₉₈₉₆₅₈₁₉₈₄₀₂ @ 1.08

*** 1 cm Si // 5 nm Cr // 10 nm Au
 => Si₁₁₉₉₉₂₂.₉₇₃₇₄₆₂₄₉₅CrAu₁.₄₁₇₂₁₈₀₀₉₃₁₂₁₃₆ @ 2.33

*** aa:RELEELNVPGEIVESLSSSEESITRINKKIEKFQSEEQQQTEDELQDKIHPFAQTQSLVYPFPGPIPNSLPQNIPPLTQTPVVVPPFLQPEVMGVSKVKEAMAPKHKEMPFPKYPVEPFTESQSLTLTDVENLHLPLPLLQSWMHQPHQPLPPTVMFPPQSVLSLSQSKVLPVPQKAVPYPQRDMPIQAFLLYQEPVLGPVRGPFPIIV
 => C₁₀₈₀H₁₃₇₀¹H₃₁₉N₂₆₈O₃₁₀S₆ @ 1.27

!!!  Bl2Oh   # Bad symbol
unknown element Bl

!!!  5 Mg NaCl // 50mL H2O@1  # Bad units
Expected end of text, found '5'  (at char 1), (line:1, col:2)

!!!  4 nm NaCl@2.17// 50 g Si  # Can't use mass in layer mixture
Expected end of text, found '4'  (at char 1), (line:1, col:2)
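
The layer example `1 cm Si // 5 nm Cr // 10 nm Au` follows the rule given in the grammar comments: mass is density times thickness times area, and the shared area cancels when taking ratios. Below is a minimal sketch of that calculation, not the code in this PR. `PROPS`, `NM_PER_UNIT` and `stack_ratios` are hypothetical names, and the densities and atomic masses are illustrative values typed in by hand.

```python
# element: (density in g/cm^3, atomic mass in g/mol). Illustrative values.
PROPS = {
    "Si": (2.33, 28.085),
    "Cr": (7.19, 51.9961),
    "Au": (19.3, 196.96657),
}
# Conversion of the LENGTH units in the grammar to nanometers.
NM_PER_UNIT = {"cm": 1e7, "mm": 1e6, "um": 1e3, "nm": 1.0, "Ang": 0.1}


def stack_ratios(layers: list[tuple[float, str, str]],
                 relative_to: str) -> dict[str, float]:
    """Mole ratios for a stack of (thickness, unit, element) layers.

    Moles per unit area are proportional to thickness * density / mass.
    The area and the constant unit factors cancel in the ratio.
    """
    moles = {}
    for thickness, unit, element in layers:
        density, mass = PROPS[element]
        amount = thickness * NM_PER_UNIT[unit] * density / mass
        moles[element] = moles.get(element, 0.0) + amount
    base = moles[relative_to]
    return {element: n / base for element, n in moles.items()}


stack = [(1, "cm", "Si"), (5, "nm", "Cr"), (10, "nm", "Au")]
print(stack_ratios(stack, relative_to="Cr"))
```

With these inputs the Si and Au counts relative to Cr come out near 1.2e6 and 1.417, in line with the listing. The trailing digits depend on the exact density and mass tables.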

@pkienzle pkienzle changed the title from "Explore lark as replacement for the pyparsing formula parser" to "Replace pyparsing with lark for better error messages" May 22, 2026