fletcher · jasminlapalme · May 31, 2022
diff --git a/DevelopmentNotes/DevelopmentNotes.html b/DevelopmentNotes/DevelopmentNotes.html
@@ -26,7 +26,7 @@ <h4 id="whyanewversion">Why a New Version?</h4>
 <li><p>Accuracy &#8211; <abbr title="MultiMarkdown">MMD</abbr> v4 and v5 were the most accurate versions yet, and a lot of
 effort went into finding and resolving various edge cases. However, it began
 to feel like a game of whack-a-mole where new bugs would creep in every time I
-fixed an old one. The <a href="#gn:1" id="gnref:1" title="see glossary" class="glossary">PEG</a> began to feel rather convoluted in spots, even
+fixed an old one. The <a href="#gn_1" id="gnref_1" title="see glossary" class="glossary">PEG</a> began to feel rather convoluted in spots, even
 though it did allow for a precise (if not always accurate) specification of
 the grammar.</p></li>
 <li><p>Performance &#8211; &#8220;Back in the day&#8221; <a href="https://github.com/jgm/peg-markdown">peg-markdown</a> was one of the fastest
@@ -1142,8 +1142,8 @@ <h3 id="changelog">Changelog</h3>
 <hr />
 <ol>
 
-<li id="gn:1">
-PEG: <p>Parsing Expression Grammar <a href="https://en.wikipedia.org/wiki/Parsing_expression_grammar">https://en.wikipedia.org/wiki/Parsing_expression_grammar</a> <a href="#gnref:1" title="return to body" class="reverseglossary">&#160;&#8617;&#xfe0e;</a></p>
+<li id="gn_1">
+PEG: <p>Parsing Expression Grammar <a href="https://en.wikipedia.org/wiki/Parsing_expression_grammar">https://en.wikipedia.org/wiki/Parsing_expression_grammar</a> <a href="#gnref_1" title="return to body" class="reverseglossary">&#160;&#8617;&#xfe0e;</a></p>
 </li>
 
 </ol>

diff --git a/QuickStart/QuickStart.html b/QuickStart/QuickStart.html
@@ -59,15 +59,15 @@ <h3 id="performance">Performance</h3>
 
 <p>When developing <abbr title="MultiMarkdown">MMD</abbr> v6, one of my goals was to keep <abbr title="MultiMarkdown">MMD</abbr> at least in the ballpark of the fastest processors. Of course, being <em>the</em> fastest would be fantastic, but I was more concerned with ensuring that the code was easily understood, and easily updated with new features in the future.</p>
 
-<p><abbr title="MultiMarkdown">MMD</abbr> v3 &#8211; v5 used a <a href="#gn:1" id="gnref:1" title="see glossary" class="glossary">PEG</a> to handle the parsing. This made it easy to understand the relationship between the <abbr title="MultiMarkdown">MMD</abbr> grammar and the parsing code, since they were one and the same. However, the parsing code generated by the parsers was not particularly fast, and was prone to troublesome edge cases with terrible performance characteristics.</p>
+<p><abbr title="MultiMarkdown">MMD</abbr> v3 &#8211; v5 used a <a href="#gn_1" id="gnref_1" title="see glossary" class="glossary">PEG</a> to handle the parsing. This made it easy to understand the relationship between the <abbr title="MultiMarkdown">MMD</abbr> grammar and the parsing code, since they were one and the same. However, the parsing code generated by the parsers was not particularly fast, and was prone to troublesome edge cases with terrible performance characteristics.</p>
 
 <p>The first step in <abbr title="MultiMarkdown">MMD</abbr> v6 parsing is to break the source text into a series of tokens, which may consist of plain text, whitespace, or special characters such as &#8216;*&#8217;, &#8216;[&#8217;, etc. This chain of tokens is then used to perform the actual parsing.</p>
 
 <p><abbr title="MultiMarkdown">MMD</abbr> v6 divides the parsing into two separate phases, which actually fits more with Markdown&#8217;s design philosophically.</p>
 
 <ol>
 <li><p>Block parsing consists of identifying the &#8220;type&#8221; of each line of the source text, and grouping the lines into blocks (e.g. paragraphs, lists, blockquotes, etc.) Some blocks are a single line (e.g. ATX headers), and others can be many lines long. The block parsing in <abbr title="MultiMarkdown">MMD</abbr> v6 is handled by a parser generated by <a href="http://www.hwaci.com/sw/lemon/">lemon</a>. This parser allows the block structure to be more readily understood by non-programmers, but the generated parser is still fast.</p></li>
-<li><p>Span parsing consists of identifying Markdown/<abbr title="MultiMarkdown">MMD</abbr> structures that occur inside of blocks, such as links, images, strong, emph, etc. Most of these structures require matching pairs of tokens to specify where the span starts and where it ends. Most of these spans allow arbitrary levels of nesting as well. This made parsing them correctly in the <a href="#gn:1" title="see glossary" class="glossary">PEG</a>-based code difficult and slow. <abbr title="MultiMarkdown">MMD</abbr> v6 uses a different approach that is accurate and has good performance characteristics even with edge cases. Basically, it keeps a stack of each &#8220;opening&#8221; token as it steps through the token chain. When a &#8220;closing&#8221; token is found, it is paired with the most recent appropriate opener on the stack. Any tokens in between the opener and closer are removed, as they are not able to be matched any more. To avoid unnecessary searches for non- existent openers, the parser keeps track of which opening tokens have been discovered. This allows the parser to continue moving forwards without having to go backwards and re-parse any previously visited tokens.</p></li>
+<li><p>Span parsing consists of identifying Markdown/<abbr title="MultiMarkdown">MMD</abbr> structures that occur inside of blocks, such as links, images, strong, emph, etc. Most of these structures require matching pairs of tokens to specify where the span starts and where it ends. Most of these spans allow arbitrary levels of nesting as well. This made parsing them correctly in the <a href="#gn_1" title="see glossary" class="glossary">PEG</a>-based code difficult and slow. <abbr title="MultiMarkdown">MMD</abbr> v6 uses a different approach that is accurate and has good performance characteristics even with edge cases. Basically, it keeps a stack of each &#8220;opening&#8221; token as it steps through the token chain. When a &#8220;closing&#8221; token is found, it is paired with the most recent appropriate opener on the stack. Any tokens in between the opener and closer are removed, as they are not able to be matched any more. To avoid unnecessary searches for non- existent openers, the parser keeps track of which opening tokens have been discovered. This allows the parser to continue moving forwards without having to go backwards and re-parse any previously visited tokens.</p></li>
 </ol>
 
 <p>The result of this redesigned <abbr title="MultiMarkdown">MMD</abbr> parser is that it can parse short documents more quickly than <a href="http://commonmark.org/">CommonMark</a>, and takes only 15% &#8211; 20% longer to parse long documents. I have not delved too deeply into this, but I presume that CommonMark has a bit more &#8220;set-up&#8221; time that becomes expensive when parsing a short document (e.g. a paragraph or two). But this cost becomes negligible when parsing longer documents (e.g. file sizes of 1 MB). So depending on your use case, CommonMark may well be faster than <abbr title="MultiMarkdown">MMD</abbr>, but we&#8217;re talking about splitting hairs here&#8230;. Recent comparisons show <abbr title="MultiMarkdown">MMD</abbr> v6 taking approximately 4.37 seconds to parse a 108 MB file (approximately 24.8 MB/second), and CommonMark took 3.72 seconds for the same file (29.2 MB/second). For comparison, <abbr title="MultiMarkdown">MMD</abbr> v5.4 took approximately 94 second for the same file (1.15 MB/second).</p>
@@ -91,7 +91,7 @@ <h3 id="parsetree">Parse Tree</h3>
 <li><p>Use the resulting token tree for your own purposes.</p></li>
 </ol>
 
-<p>The token tree (<a href="#gn:2" id="gnref:2" title="see glossary" class="glossary">AST</a>) includes starting offsets and length of each token, allowing you to use <abbr title="MultiMarkdown">MMD</abbr> as part of a syntax highlighter. <abbr title="MultiMarkdown">MMD</abbr> v5 did not have this functionality in the public version, in part because the <a href="#gn:1" title="see glossary" class="glossary">PEG</a> parsers used did not provide reliable offset positions, requiring a great deal of effort when I adapted <abbr title="MultiMarkdown">MMD</abbr> for use in <a href="http://multimarkdown.com/">MultiMarkdown Composer</a>.</p>
+<p>The token tree (<a href="#gn_2" id="gnref_2" title="see glossary" class="glossary">AST</a>) includes starting offsets and length of each token, allowing you to use <abbr title="MultiMarkdown">MMD</abbr> as part of a syntax highlighter. <abbr title="MultiMarkdown">MMD</abbr> v5 did not have this functionality in the public version, in part because the <a href="#gn_1" title="see glossary" class="glossary">PEG</a> parsers used did not provide reliable offset positions, requiring a great deal of effort when I adapted <abbr title="MultiMarkdown">MMD</abbr> for use in <a href="http://multimarkdown.com/">MultiMarkdown Composer</a>.</p>
 
 <p>These steps are managed using the <code>mmd_engine</code> &#8220;object&#8221;. An individual <code>mmd_engine</code> cannot be used by multiple threads simultaneously, so if libMultiMarkdown is to be used in a multithreaded program, a separate <code>mmd_engine</code> should be created for each thread. Alternatively, just use the slightly more abstracted <code>mmd_convert_string()</code> function that handles creating and destroying the <code>mmd_engine</code> automatically.</p>
 
@@ -160,7 +160,7 @@ <h4 id="footnotes">Footnotes</h4>
 
 <h4 id="glossaryterms">Glossary Terms</h4>
 
-<p>If there are terms in your document you wish to define in a <a href="#gn:3" id="gnref:3" title="see glossary" class="glossary">glossary</a> at the end of your document, you can define them using the glossary syntax.</p>
+<p>If there are terms in your document you wish to define in a <a href="#gn_3" id="gnref_3" title="see glossary" class="glossary">glossary</a> at the end of your document, you can define them using the glossary syntax.</p>
 
 <p>Glossary terms can be specified using inline or reference syntax. The inline variant requires that the abbreviation be wrapped in parentheses and immediately follows the <code>?</code>.</p>
 
@@ -410,16 +410,16 @@ <h3 id="futuresteps">Future Steps</h3>
 <hr />
 <ol>
 
-<li id="gn:1">
-PEG: <p>Parsing Expression Grammar <a href="https://en.wikipedia.org/wiki/Parsing_expression_grammar">https://en.wikipedia.org/wiki/Parsing_expression_grammar</a> <a href="#gnref:1" title="return to body" class="reverseglossary">&#160;&#8617;</a></p>
+<li id="gn_1">
+PEG: <p>Parsing Expression Grammar <a href="https://en.wikipedia.org/wiki/Parsing_expression_grammar">https://en.wikipedia.org/wiki/Parsing_expression_grammar</a> <a href="#gnref_1" title="return to body" class="reverseglossary">&#160;&#8617;</a></p>
 </li>
 
-<li id="gn:2">
-AST: <p>Abstract Syntax Tree <a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">https://en.wikipedia.org/wiki/Abstract_syntax_tree</a> <a href="#gnref:2" title="return to body" class="reverseglossary">&#160;&#8617;</a></p>
+<li id="gn_2">
+AST: <p>Abstract Syntax Tree <a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">https://en.wikipedia.org/wiki/Abstract_syntax_tree</a> <a href="#gnref_2" title="return to body" class="reverseglossary">&#160;&#8617;</a></p>
 </li>
 
-<li id="gn:3">
-glossary: <p>The glossary collects information about important terms used in your document <a href="#gnref:3" title="return to body" class="reverseglossary">&#160;&#8617;</a></p>
+<li id="gn_3">
+glossary: <p>The glossary collects information about important terms used in your document <a href="#gnref_3" title="return to body" class="reverseglossary">&#160;&#8617;</a></p>
 </li>
 
 </ol>
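
Reviewer note: the span-parsing paragraph in the QuickStart hunk above describes a single forward pass with a stack of openers. A minimal sketch of that idea follows, assuming a simplified token model. The names `tok`, `tok_kind`, `family`, `mate`, and `pair_tokens` are hypothetical and are not taken from the MultiMarkdown sources; the real parser's token types and matching rules are more involved.

```c
#include <stddef.h>

/* Hypothetical token kinds for this sketch; MMD's real token types differ. */
typedef enum { TOK_TEXT, TOK_OPEN, TOK_CLOSE } tok_kind;

typedef struct {
    tok_kind kind;
    char     family; /* delimiter family shared by opener and closer, e.g. '[' or '*' */
    int      mate;   /* index of the paired token, or -1 if unpaired */
} tok;

/* Pair each closer with the most recent opener of the same family in one
 * forward pass. Openers skipped over while searching are dropped from the
 * stack, since they can no longer be matched. Returns the number of pairs. */
size_t pair_tokens(tok *toks, size_t n) {
    size_t stack[256];            /* indices of unmatched openers (fixed cap for the sketch) */
    size_t depth = 0;
    size_t open_count[256] = {0}; /* openers currently on the stack, per family */
    size_t pairs = 0;

    for (size_t i = 0; i < n; i++) {
        toks[i].mate = -1;
    }

    for (size_t i = 0; i < n; i++) {
        unsigned char f = (unsigned char) toks[i].family;

        if (toks[i].kind == TOK_OPEN) {
            if (depth < 256) {
                stack[depth++] = i;
                open_count[f]++;
            }
        } else if (toks[i].kind == TOK_CLOSE) {
            if (open_count[f] == 0) {
                continue; /* no opener of this family seen: skip the search */
            }

            /* Pop until the matching opener is found; it is guaranteed to be
             * on the stack because open_count[f] is nonzero. */
            while (depth > 0) {
                size_t j = stack[--depth];
                unsigned char fj = (unsigned char) toks[j].family;
                open_count[fj]--;

                if (fj == f) {
                    toks[j].mate = (int) i;
                    toks[i].mate = (int) j;
                    pairs++;
                    break;
                }
            }
        }
    }

    return pairs;
}
```

Each opener is pushed once and popped at most once, so the pass never revisits earlier tokens. For the sequence open `*`, open `[`, close `*`, close `[`, the sketch pairs the two `*` tokens, discards the `[` opener in between, and then skips the final closer without searching because no `[` opener remains.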

diff --git a/src/html.c b/src/html.c
@@ -838,7 +838,7 @@ void mmd_export_token_html(DString * out, const char * source, token * t, scratc
 				scratch->footnote_para_counter--;
 
 				if (scratch->footnote_para_counter == 0) {
-					printf(" <a href=\"#cnref:%d\" title=\"%s\" class=\"reversecitation\">&#160;&#8617;&#xfe0e;</a>", scratch->citation_being_printed, LC("return to body"));
+					printf(" <a href=\"#cnref_%d\" title=\"%s\" class=\"reversecitation\">&#160;&#8617;&#xfe0e;</a>", scratch->citation_being_printed, LC("return to body"));
 				}
 			}
 
@@ -853,15 +853,15 @@ void mmd_export_token_html(DString * out, const char * source, token * t, scratc
 						temp_short = rand() % 32000 + 1;
 					}
 
-					printf(" <a href=\"#fnref:%d\" title=\"%s\" class=\"reversefootnote\">&#160;&#8617;&#xfe0e;</a>", temp_short, LC("return to body"));
+					printf(" <a href=\"#fnref_%d\" title=\"%s\" class=\"reversefootnote\">&#160;&#8617;&#xfe0e;</a>", temp_short, LC("return to body"));
 				}
 			}
 
 			if (scratch->glossary_being_printed) {
 				scratch->footnote_para_counter--;
 
 				if (scratch->footnote_para_counter == 0) {
-					printf(" <a href=\"#gnref:%d\" title=\"%s\" class=\"reverseglossary\">&#160;&#8617;&#xfe0e;</a>", scratch->glossary_being_printed, LC("return to body"));
+					printf(" <a href=\"#gnref_%d\" title=\"%s\" class=\"reverseglossary\">&#160;&#8617;&#xfe0e;</a>", scratch->glossary_being_printed, LC("return to body"));
 				}
 			}
 
@@ -1575,23 +1575,23 @@ void mmd_export_token_html(DString * out, const char * source, token * t, scratc
 
 						if (temp_short2 == scratch->used_citations->size) {
 							// This is a re-use of a previously used note
-							printf("<a href=\"#cn:%d\" title=\"%s\" class=\"citation\">(%d)</a>",
+							printf("<a href=\"#cn_%d\" title=\"%s\" class=\"citation\">(%d)</a>",
 								   temp_short, LC("see citation"), temp_short);
 						} else {
 							// This is the first time this note was used
-							printf("<a href=\"#cn:%d\" id=\"cnref:%d\" title=\"%s\" class=\"citation\">(%d)</a>",
+							printf("<a href=\"#cn_%d\" id=\"cnref_%d\" title=\"%s\" class=\"citation\">(%d)</a>",
 								   temp_short, temp_short, LC("see citation"), temp_short);
 						}
 					} else {
 						// Locator present
 
 						if (temp_short2 == scratch->used_citations->size) {
 							// This is a re-use of a previously used note
-							printf("<a href=\"#cn:%d\" title=\"%s\" class=\"citation\">(%s, %d)</a>",
+							printf("<a href=\"#cn_%d\" title=\"%s\" class=\"citation\">(%s, %d)</a>",
 								   temp_short, LC("see citation"), temp_char, temp_short);
 						} else {
 							// This is the first time this note was used
-							printf("<a href=\"#cn:%d\" id=\"cnref:%d\" title=\"%s\" class=\"citation\">(%s, %d)</a>",
+							printf("<a href=\"#cn_%d\" id=\"cnref_%d\" title=\"%s\" class=\"citation\">(%s, %d)</a>",
 								   temp_short, temp_short, LC("see citation"), temp_char, temp_short);
 						}
 					}
@@ -1638,7 +1638,7 @@ void mmd_export_token_html(DString * out, const char * source, token * t, scratc
 						temp_short3 = temp_short;
 					}
 
-					printf("<a href=\"#fn:%d\" title=\"%s\" class=\"footnote\"><sup>%d</sup></a>",
+					printf("<a href=\"#fn_%d\" title=\"%s\" class=\"footnote\"><sup>%d</sup></a>",
 						   temp_short3, LC("see footnote"), temp_short);
 				} else {
 					// This is the first time this note was used
@@ -1650,7 +1650,7 @@ void mmd_export_token_html(DString * out, const char * source, token * t, scratc
 						temp_short3 = temp_short;
 					}
 
-					printf("<a href=\"#fn:%d\" id=\"fnref:%d\" title=\"%s\" class=\"footnote\"><sup>%d</sup></a>",
+					printf("<a href=\"#fn_%d\" id=\"fnref_%d\" title=\"%s\" class=\"footnote\"><sup>%d</sup></a>",
 						   temp_short3, temp_short3, LC("see footnote"), temp_short);
 				}
 			} else {
@@ -1690,15 +1690,15 @@ void mmd_export_token_html(DString * out, const char * source, token * t, scratc
 				if (temp_short2 == scratch->used_glossaries->size) {
 					// This is a re-use of a previously used note
 
-					printf("<a href=\"#gn:%d\" title=\"%s\" class=\"glossary\">",
+					printf("<a href=\"#gn_%d\" title=\"%s\" class=\"glossary\">",
 						   temp_short, LC("see glossary"));
 					mmd_print_string_html(out, temp_note->clean_text, false, true);
 					print_const("</a>");
 				} else {
 					// This is the first time this note was used
 
 
-					printf("<a href=\"#gn:%d\" id=\"gnref:%d\" title=\"%s\" class=\"glossary\">",
+					printf("<a href=\"#gn_%d\" id=\"gnref_%d\" title=\"%s\" class=\"glossary\">",
 						   temp_short, temp_short, LC("see glossary"));
 					mmd_print_string_html(out, temp_note->clean_text, false, true);
 					print_const("</a>");
@@ -2479,7 +2479,7 @@ void mmd_export_footnote_list_html(DString * out, const char * source, scratch_p
 			// Export footnote
 			pad(out, 2, scratch);
 
-			printf("<li id=\"fn:%d\">\n", i + 1);
+			printf("<li id=\"fn_%d\">\n", i + 1);
 			scratch->padded = 6;
 
 			note = stack_peek_index(scratch->used_footnotes, i);
@@ -2527,7 +2527,7 @@ void mmd_export_glossary_list_html(DString * out, const char * source, scratch_p
 			// Export glossary
 			pad(out, 2, scratch);
 
-			printf("<li id=\"gn:%d\">\n", i + 1);
+			printf("<li id=\"gn_%d\">\n", i + 1);
 			scratch->padded = 6;
 
 			note = stack_peek_index(scratch->used_glossaries, i);
@@ -2580,7 +2580,7 @@ void mmd_export_citation_list_html(DString * out, const char * source, scratch_p
 			// Export footnote
 			pad(out, 2, scratch);
 
-			printf("<li id=\"cn:%d\">\n", i + 1);
+			printf("<li id=\"cn_%d\">\n", i + 1);
 			scratch->padded = 6;
 
 			note = stack_peek_index(scratch->used_citations, i);
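
Reviewer note: every hunk in `src/html.c` makes the same change, replacing the `:` between the note prefix (`fn`, `fnref`, `cn`, `cnref`, `gn`, `gnref`) and the number with `_`. The sketch below shows the resulting naming scheme in one place. `note_anchor_id` is a hypothetical helper written for illustration; it does not exist in MultiMarkdown, which formats these ids inline with `printf` as shown in the diff.

```c
#include <stdio.h>

/* Hypothetical helper, not part of MultiMarkdown: writes "<prefix>_<n>"
 * (e.g. "fn_1", "gnref_3") into buf. Returns 0 on success, or -1 if the
 * result did not fit in the buffer or formatting failed. */
int note_anchor_id(char *buf, size_t size, const char *prefix, int n) {
    int len = snprintf(buf, size, "%s_%d", prefix, n);

    return (len < 0 || (size_t) len >= size) ? -1 : 0;
}
```

The diff does not state its motivation, so the following is only a guess: an id such as `fn:1` is valid HTML, but a CSS selector has to escape the colon (`#fn\:1`), whereas `#fn_1` can be used as written.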

diff --git a/tests/MMD6Tests/Abbreviations.html b/tests/MMD6Tests/Abbreviations.html
@@ -80,9 +80,9 @@ <h1 id="foobar">foo bar</h1>
 
 <p>20</p>
 
-<p><a href="#fn:1" id="fnref:1" title="see footnote" class="footnote"><sup>1</sup></a></p>
+<p><a href="#fn_1" id="fnref_1" title="see footnote" class="footnote"><sup>1</sup></a></p>
 
-<p><a href="#fn:2" id="fnref:2" title="see footnote" class="footnote"><sup>2</sup></a></p>
+<p><a href="#fn_2" id="fnref_2" title="see footnote" class="footnote"><sup>2</sup></a></p>
 
 <ul>
 <li><abbr title="FOO">foo</abbr></li>
@@ -94,12 +94,12 @@ <h1 id="foobar">foo bar</h1>
 <hr />
 <ol>
 
-<li id="fn:1">
-<p><abbr title="FOO">foo</abbr> and <abbr title="BAR">bar</abbr> <a href="#fnref:1" title="return to body" class="reversefootnote">&#160;&#8617;&#xfe0e;</a></p>
+<li id="fn_1">
+<p><abbr title="FOO">foo</abbr> and <abbr title="BAR">bar</abbr> <a href="#fnref_1" title="return to body" class="reversefootnote">&#160;&#8617;&#xfe0e;</a></p>
 </li>
 
-<li id="fn:2">
-<p>foo and bar <a href="#fnref:2" title="return to body" class="reversefootnote">&#160;&#8617;&#xfe0e;</a></p>
+<li id="fn_2">
+<p>foo and bar <a href="#fnref_2" title="return to body" class="reversefootnote">&#160;&#8617;&#xfe0e;</a></p>
 </li>
 
 </ol>