Fixes to RMG Cantera yaml writers to better match ck2yaml by djlucey · Pull Request #2947 · ReactionMechanismGenerator/RMG-Py

djlucey · 2026-05-12T01:38:36Z

Motivation or Problem

The Cantera yaml output files written in RMG had some errors when loaded into Cantera.

The cantera1 writer defines a Cantera reaction object by its products and reactants, and the equation that is then written to the YAML and loaded back into Cantera ends up duplicating third-body-type surface species for certain reactions. This is described in Cantera issue 2115.
The cantera2 writer is missing X as an element, which gives "CanteraError thrown by Phase::addSpecies: Species 'vacantX(3)' contains an undefined element 'X'." It also does not specify the sites correctly, which gives the error "Number of surface sites not balanced in reaction HOCOXX(73) <=> OX(6) + XCOH(54)."

Once fixes were applied to load mechanisms into Cantera without errors, the comparison file generated at the end of the RMG run revealed some differences.

Both methods were missing the transport note entry in the species definitions (which is present in ck2yaml output).
Both methods were missing "state" for the surface phase (and cantera2 was also missing it for the gas phase).
The ordering of the cantera2 reactions was not consistent with ck2yaml, which separates them into surface and gas reactions. Because the comparison method depends on the order of the reactions in the file, it is of limited use until this separation is applied.

Description of Changes

The "state" field was populated in each place where it was missing from the RMG writers but present in ck2yaml output. The species transport note had been turned off for the non-annotated YAML files, so this condition was removed to be consistent with ck2yaml. For surface mechanisms, the gas and surface reactions were separated in the cantera2 method, which enabled the comparison method. The cantera1 logic was adapted for use in cantera2 to specify custom elements. For surface species, "sites" was populated in cantera2 by counting the X atoms. To avoid the duplication of species caused by Cantera's reaction definition, the equation is composed manually from reactants and products.
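The gas/surface separation described above can be sketched as a stable partition. This is only an illustration, not the code in the PR: the function name, the dict layout of a reaction, and the species labels are all hypothetical.

```python
def split_gas_and_surface(reactions, surface_species):
    """Partition reactions into (gas, surface) lists, keeping their original order.

    `reactions` is a list of dicts with 'reactants' and 'products' label lists;
    this layout is an assumption made for the sketch, not RMG's data model.
    """
    surface_species = set(surface_species)
    gas, surface = [], []
    for rxn in reactions:
        involved = set(rxn["reactants"]) | set(rxn["products"])
        # Any reaction that touches a surface species goes in the surface block.
        if involved & surface_species:
            surface.append(rxn)
        else:
            gas.append(rxn)
    return gas, surface


# Hypothetical labels, used only to exercise the function.
reactions = [
    {"reactants": ["CH4(2)", "X(1)"], "products": ["CH4X(8)"]},
    {"reactants": ["H(9)", "H(9)"], "products": ["H2(5)"]},
    {"reactants": ["OX(6)", "HX(4)"], "products": ["HOX(7)", "X(1)"]},
]
gas, surface = split_gas_and_surface(
    reactions, {"X(1)", "CH4X(8)", "OX(6)", "HX(4)", "HOX(7)"}
)
```

Keeping the relative order inside each group matters here because the comparison depends on reaction order in the file.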

Testing

I used this input file for methane partial oxidation (input.py), which writes Cantera YAML with both methods. The comparison text files that were generated were used to identify differences between the ck2yaml method and the RMG YAML writers. Claude was used to help write tests checking that the missing fields are present and that the generated YAML files agree with the ck2yaml output. The new and existing tests seem to be working for me as far as I can tell.

…misidentification

Cantera's Python API, when constructing ct.Reaction from reactants/products dicts, silently re-interprets any species with net-zero stoichiometry (equal count on both sides, i.e. a spectator or surface catalyst) as a third-body collider. This mutates input_data: the equation string has the spectator's stoichiometry doubled and a spurious 'efficiencies' entry is added. The corrupted YAML cannot be round-tripped. This affects all rate types (ArrheniusRate, InterfaceArrheniusRate, StickingArrheniusRate, PlogRate, ChebyshevRate) and is reproducible on Cantera 3.1.0 and 3.2.0. A bug report has been filed with the Cantera project.

Fix: replace the reactants/products dict form of ct.Reaction with the equation-string form for all non-third-body reaction types. Passing an equation string avoids the misidentification entirely. ThirdBody, Troe, and Lindemann reactions are left using the dict form because they require the third_body= keyword parameter, and their species do not appear on both sides so the bug does not affect them.

A nested helper _ct_equation() is added inside to_cantera() to build the equation string from the already-computed ct_reactants / ct_products dicts.

Also adds a regression test (test_reaction_to_dicts_surface_spectator_species) that verifies no 'efficiencies' key and no doubled stoichiometry appear for a SurfaceArrhenius reaction where the same surface species appears on both sides.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
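For illustration, a helper that builds an equation string from reactant/product dicts might look like the sketch below. It is not the actual _ct_equation() from the commit: the standalone function name, the reversible flag, and the species labels are assumptions, and it deliberately avoids importing Cantera.

```python
def ct_equation(reactants, products, reversible=True):
    """Build an equation string from {species_name: stoichiometry} dicts.

    A sketch only; the real helper is nested inside to_cantera() and may differ.
    """
    def side(species):
        terms = []
        for name, coeff in species.items():
            if coeff == 1:
                # A coefficient of 1 is left implicit.
                terms.append(name)
            elif float(coeff).is_integer():
                # Write whole numbers without a trailing ".0".
                terms.append(f"{int(coeff)} {name}")
            else:
                terms.append(f"{coeff} {name}")
        return " + ".join(terms)

    arrow = " <=> " if reversible else " => "
    return side(reactants) + arrow + side(products)


# A spectator site (hypothetical label X(1)) appears once on each side
# and is written once on each side, with no doubling.
equation = ct_equation({"A": 1, "X(1)": 1}, {"B": 1, "X(1)": 1})
```

Because the string is composed directly from the dicts, a species present on both sides keeps exactly the stoichiometry it was given.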

defines base elements from existing list, but also includes custom elements and surface site as done in cantera1 method

Replaces the ad-hoc sum over atoms with the existing Molecule API method, and only emits the 'sites' field when a species occupies more than one site (monodentate adsorbates with sites=1 don't need the field). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
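The "emit 'sites' only above one" rule can be sketched as follows. The commit does not name the Molecule API method it uses, so this sketch counts "X" symbols in a plain list as a stand-in; the function name and entry layout are hypothetical.

```python
def surface_species_entry(name, atom_symbols):
    """Return a species dict, adding 'sites' only for multidentate adsorbates.

    Counting 'X' symbols is a stand-in for the Molecule API method mentioned
    in the commit message; it is an assumption of this sketch.
    """
    n_sites = sum(1 for symbol in atom_symbols if symbol == "X")
    entry = {"name": name}
    if n_sites > 1:
        # Monodentate adsorbates are left at the default of one site.
        entry["sites"] = n_sites
    return entry


mono = surface_species_entry("OX(6)", ["O", "X"])
bidentate = surface_species_entry("HOCOXX(73)", ["H", "O", "C", "O", "X", "X"])
```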

…tch ck2yaml

…annotated file

The way things are set up currently, transport notes belong in annotated output only. The previous commit removed the verbose guards to match the ck2yaml version. Yes, we want them consistent, but for now make it so the non-annotated YAML files have no comments/notes, in any version.

Instead: restore the if-verbose gating in both yaml_cantera1 and yaml_cantera2, and strip transport notes from cantera_from_ck/chem.yaml after ck2yaml generates it when verbose_comments is False, so all three non-annotated files are consistent.

To remove them from the ck2yaml-generated file, we can use a regex to remove 'note:' lines and their indented continuations. This avoids yaml.safe_load + yaml.dump (which reformats the whole file).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
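A regex of the kind described might look like the sketch below. The pattern and function name are assumptions, not the ones in the commit. Note that it removes every 'note:' key it finds, and a blank line ends a continuation.

```python
import re

# Match a line that is "<indent>note: ..." plus any following lines that are
# indented more deeply than the note key itself (its continuation lines).
NOTE_RE = re.compile(
    r"^(?P<indent>[ \t]*)note:.*\n?(?:(?P=indent)[ \t]+.*\n?)*",
    re.MULTILINE,
)


def strip_notes(yaml_text):
    """Remove 'note:' entries from YAML text without reparsing the file."""
    return NOTE_RE.sub("", yaml_text)


sample = (
    "  transport:\n"
    "    model: gas\n"
    "    diameter: 3.746\n"
    "    note: first line of a long note\n"
    "      continued on a deeper-indented line\n"
    "- name: next\n"
)
stripped = strip_notes(sample)
```

Working on the raw text leaves the rest of the file byte-for-byte unchanged, which is the point of avoiding a load/dump round trip.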

…res cantera2 writer to ck2yaml as is done for cantera1

…haviour

Two tests were written against the original (ungated) transport note behaviour and against sites=1 being emitted for monodentate species:

- Rename and invert the transport note tests: note must be absent when verbose=False and present when verbose=True.
- Single-site surface species should have no 'sites' key (Cantera defaults to 1); only bidentate+ species emit the field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

These files are generated on the fly by running the mainTest functional test. Committing them bakes in machine-specific paths and database-version-dependent thermo coefficients that drift over time. The comparison test already skips gracefully when the files are absent, so CI is unaffected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…d compares cantera2 writer to ck2yaml as is done for cantera1

Both Cantera YAML writers and the Chemkin writer were emitting a hardcoded periodic-table grab-bag (H, C, O, N, Ne, Ar, He, Si, S, F, Cl, Br, I plus D/T/CI/OI isotopes plus X) in every output file, regardless of what the model actually contained. This polluted the files and caused diffs against ck2yaml-converted reference outputs.

Add ReactionModel.get_elements() that walks each species' molecule[0] atoms and returns the set of Element singletons in use. All three writers now derive their elements block from that set: built-in elements appear only when present; isotopes (D, T, CI, OI) appear only when an isotope atom is on some species; X appears only for surface models. Writer 2's is_plasma path still adds the E pseudo-element without iterating atoms.

In chemkin.pyx save_chemkin, the union is computed once across all species before the surface/gas split so that the gas-only Chemkin file in a surface run still lists X (required for downstream ck2yaml conversion to recognize the surface site element).

We could cache this list of elements, either updated once per save_all() call, or even just once after model initiation (since no chemistry can create an element that wasn't already there). For simplicity, that is not done yet.

Side cleanups: drop the unused search_for_additional_elements branch in writer 2 (the new design subsumes it); replace writer 2's inline MixedModel with a real ReactionModel so .get_elements() works without duck typing; remove the module-import-time ELEMENTS_BLOCK/ELEMENTS_LINE globals in writer 1 in favor of per-call computation. Mock containers in the writer 2 tests now inherit ReactionModel for the same reason.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
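The idea behind get_elements() can be sketched with plain stand-in classes. The real method lives on ReactionModel and returns Element singletons from each species' molecule[0]; the Atom and Species classes and the symbol strings below are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    symbol: str  # stand-in for an RMG Element singleton


@dataclass
class Species:
    label: str
    atoms: list  # stand-in for species.molecule[0].atoms


def get_elements(species_list):
    """Return the set of element symbols actually used by the species."""
    return {atom.symbol for species in species_list for atom in species.atoms}


gas_only = [Species("CH4", [Atom("C")] + [Atom("H")] * 4)]
with_surface = gas_only + [Species("OX", [Atom("O"), Atom("X")])]

gas_elements = get_elements(gas_only)
surface_elements = get_elements(with_surface)
```

With this approach X shows up only when some species carries a surface site, and unused elements such as N or Ar never reach the output.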

The destination directories under test/rmgpy/test_data/yaml_writer_data/ (cantera1, cantera2, ck2yaml) are not guaranteed to exist on a fresh checkout: we recently removed the committed golden YAML files, leaving the subdirectories empty. Git does not track empty directories, so CI failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
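The commit message does not spell out the fix. One common way to handle missing destination directories is to create them on demand before writing, as in this sketch; the function name is hypothetical and a temporary directory stands in for the test data path.

```python
import os
import tempfile


def ensure_output_dirs(base_dir, names=("cantera1", "cantera2", "ck2yaml")):
    """Create each destination subdirectory if it does not already exist."""
    paths = []
    for name in names:
        path = os.path.join(base_dir, name)
        # exist_ok=True makes the call safe to repeat on later runs.
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    created = ensure_output_dirs(tmp)
    all_exist = all(os.path.isdir(path) for path in created)
    repeat_is_safe = ensure_output_dirs(tmp) == created
```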

djlucey requested a review from rwest May 12, 2026 01:38

djlucey force-pushed the yaml_fix branch from 6d94cbb to 021903d Compare May 12, 2026 13:15

rwest and others added 12 commits May 13, 2026 14:41

[yaml_cantera2] Use full elements list, like in cantera1 method

70fb16c

defines base elements from existing list, but also includes custom elements and surface site as done in cantera1 method

counts surface sites and assigns the corresponding field

baeb6ee

adds transport note to each species, not just in annotated yaml to ma…

007028c

…tch ck2yaml

separates gas reactions and surface reactions to match ck2yaml

62ed6e7

adds 'state' to gas and surface phases to match ck2yaml

3de90ac

tests for entries and fields that are expected from ck2yaml and compa…

7dc2668

…res cantera2 writer to ck2yaml as is done for cantera1

yaml_fix generated recent files for comparison mainTest

1576083

rwest force-pushed the yaml_fix branch from 021903d to 58c8673 Compare May 13, 2026 18:41

rwest and others added 3 commits May 13, 2026 16:17

fixup! tests for entries and fields that are expected from ck2yaml an…

0742d2c

…d compares cantera2 writer to ck2yaml as is done for cantera1

djlucey wants to merge 15 commits into main from yaml_fix
