Plain text is rarely sufficient, which has led to a plethora of incompatible rich text formats. Here we use our own AST-based rich text system, modeled on Markdown, with extensions based on MathML.
A schema for the AST can be found in ./types.ts. Examples can be found in ./test_conversion.ts and ../../server/markdown/test_rich.ts.
The MathML subsystem is flexible, but its type schema is quite loose. This is discussed in more detail in ./mathml.ts, which features several tools for easily building rudimentary MathML expressions. That module is used, for example, in the UnivariatePolynomial and Spline classes, as well as for denoting computational complexity in the ComplexityMathMLHelper class.
Long story short, the code at ../../server/markdown parses Markdown via CommonMark and then converts the resulting AST into our rich text AST format.
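The conversion step can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: `CmNode` is a hypothetical node shape loosely modeled on the linked-list nodes that CommonMark parsers such as commonmark.js produce, and `RichNode` is a hypothetical stand-in for the real schema in ./types.ts. The actual code in ../../server/markdown may be organized quite differently.

```typescript
// Hypothetical parser node: children form a linked list via firstChild/next.
interface CmNode {
  type: string; // e.g. 'document', 'paragraph', 'text', 'emph', 'strong', 'code'
  literal: string | null;
  firstChild: CmNode | null;
  next: CmNode | null;
}

// Hypothetical rich text AST; the real schema lives in ./types.ts.
type RichNode =
  | { kind: 'text'; text: string }
  | { kind: 'code'; text: string }
  | { kind: 'emphasis'; children: RichNode[] }
  | { kind: 'strong'; children: RichNode[] }
  | { kind: 'paragraph'; children: RichNode[] }
  | { kind: 'document'; children: RichNode[] };

// Walk the linked list of children and convert each one.
function convertChildren(node: CmNode): RichNode[] {
  const result: RichNode[] = [];
  for (let child = node.firstChild; child !== null; child = child.next) {
    result.push(convert(child));
  }
  return result;
}

// Map one parser node to one rich text node, recursing into children.
function convert(node: CmNode): RichNode {
  switch (node.type) {
    case 'text':
      return { kind: 'text', text: node.literal ?? '' };
    case 'code':
      return { kind: 'code', text: node.literal ?? '' };
    case 'emph':
      return { kind: 'emphasis', children: convertChildren(node) };
    case 'strong':
      return { kind: 'strong', children: convertChildren(node) };
    case 'paragraph':
      return { kind: 'paragraph', children: convertChildren(node) };
    case 'document':
      return { kind: 'document', children: convertChildren(node) };
    default:
      // Fail loudly rather than silently dropping unknown constructs.
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// Helper for building parser-like trees by hand (for demonstration only).
function cm(type: string, literal: string | null, children: CmNode[] = []): CmNode {
  let next: CmNode | null = null;
  for (const child of [...children].reverse()) {
    child.next = next;
    next = child;
  }
  return { type, literal, firstChild: next, next: null };
}
```

For instance, `convert(cm('paragraph', null, [cm('text', 'Hi')]))` yields a `paragraph` node with a single `text` child. Keeping the parser's node shape out of the rest of the code base is the point of the intermediate step: only the converter needs to know how the parser represents a document.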
The website previously used a custom Markdown parser. You can find a working one here and an improved but unfinished one here.
There are two reasons why I decided to abandon that.
- Markdown has a surprising number of nuances, as can be seen in the CommonMark Spec. These make parsers incompatible with each other and force every implementation to make choices. For example, indented text may be a code block, but may just as well be a nested list. The custom parser was more of a burden than a good learning experience.
- At some point I started thinking about setting up a blog, and being able to write mathematical expressions was a necessity. Thus, I was forced either to introduce my own Markdown extensions, or to abandon my parser altogether in favor of another approach.
I decided to develop an AST-based system that is flexible enough to handle both Markdown and HTML+MathML.
Another format that is partially supported is LaTeX via TeX4ht, which can be parsed by the code at ../../server/html. The latter acts as a general HTML parser (used for parsing .README_{lang}.html files in the server's file service), but only supports a Markdown-like limited subset of HTML, extended with MathML.
If it turns out to be easy enough, I may at some point also experiment with plasTeX or possibly even Typst.
We support converting some part of the AST back to Markdown-based plain text in ./conversion.ts.
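As a rough illustration of what such a back-conversion involves, here is a minimal sketch. The `RichNode` type and the `toMarkdown` function are hypothetical names invented for this example; they are not taken from ./types.ts or ./conversion.ts, and the sketch ignores escaping of special characters.

```typescript
// Hypothetical rich text AST; the real schema lives in ./types.ts.
type RichNode =
  | { kind: 'text'; text: string }
  | { kind: 'code'; text: string }
  | { kind: 'emphasis'; children: RichNode[] }
  | { kind: 'strong'; children: RichNode[] }
  | { kind: 'paragraph'; children: RichNode[] }
  | { kind: 'document'; children: RichNode[] };

// Serialize a list of nodes and join the results with a separator.
function joinChildren(children: RichNode[], separator: string): string {
  return children.map(toMarkdown).join(separator);
}

// Recursively serialize a node to Markdown-flavored plain text.
// Note: no escaping is performed, so text containing '*' or '`' is emitted verbatim.
function toMarkdown(node: RichNode): string {
  switch (node.kind) {
    case 'text':
      return node.text;
    case 'code':
      return '`' + node.text + '`';
    case 'emphasis':
      return '*' + joinChildren(node.children, '') + '*';
    case 'strong':
      return '**' + joinChildren(node.children, '') + '**';
    case 'paragraph':
      return joinChildren(node.children, '');
    case 'document':
      // Paragraphs are separated by a blank line.
      return joinChildren(node.children, '\n\n');
  }
}
```

A node kind with no Markdown counterpart would need either a fallback rendering or an explicit error in a serializer like this, which is one plausible reason why only part of the AST can be converted back.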