fix(textkit): preserve empty glyph codepoints #3405

Open
brandon-julio-t wants to merge 1 commit into diegomura:master from brandon-julio-t:fix-textkit-empty-glyph-codepoints

brandon-julio-t commented Apr 28, 2026

Summary

  • Normalize missing/empty glyph codePoints from the original run string before resolving textkit glyph/string indices.
  • Use existing non-empty glyph mappings as anchors so empty glyph mappings are only filled when the full sequence remains aligned and the missing range has a unique assignment.
  • Recompute repaired glyph metadata (isLigature, isMark) on cloned glyphs so downstream layout does not see stale fontkit cache metadata.
  • Add regression tests for leading CJK glyphs, repeated source code points, repaired ligature metadata, ambiguous consecutive empty glyphs, and the partial-alignment bail-out.
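
For readers who want a concrete picture of the clone-and-recompute step, here is a minimal sketch. It is not the code from this PR. `GlyphLike`, `withCodePoints`, and `isCombiningMark` are hypothetical names, and the sketch assumes that `isLigature` and `isMark` are derived purely from `codePoints` (more than one code point means ligature; all code points being marks means mark). The mark check is a deliberately narrow stand-in that only covers the Combining Diacritical Marks block.

```typescript
// Hypothetical minimal glyph shape; real fontkit glyphs carry more fields.
interface GlyphLike {
  id: number;
  codePoints: number[];
  isLigature: boolean;
  isMark: boolean;
}

// Stand-in mark test (assumption): only U+0300..U+036F.
// A real implementation would consult full Unicode category data.
function isCombiningMark(cp: number): boolean {
  return cp >= 0x0300 && cp <= 0x036f;
}

// Return a copy of `glyph` with new code points and recomputed flags.
// The original object is left untouched, so a shared cache entry stays as it was.
function withCodePoints(glyph: GlyphLike, codePoints: number[]): GlyphLike {
  // Keep the prototype so getters/methods defined on the glyph class still work.
  const clone: GlyphLike = Object.assign(
    Object.create(Object.getPrototypeOf(glyph)),
    glyph,
  );
  clone.codePoints = codePoints;
  clone.isLigature = codePoints.length > 1;
  clone.isMark = codePoints.length > 0 && codePoints.every(isCombiningMark);
  return clone;
}

// Usage: a cached glyph with an empty mapping is repaired on a clone.
const cachedGlyph: GlyphLike = { id: 7, codePoints: [], isLigature: false, isMark: false };
const repairedGlyph = withCodePoints(cachedGlyph, [0x66, 0x69]); // "f", "i"
console.log(repairedGlyph.isLigature, cachedGlyph.codePoints.length);
```

Cloning matters here because the same glyph object may be handed out again for a different run, where the repaired code points would be wrong.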

Fixes #3404.

Why Not Only Check for an Empty Array?

An empty glyph.codePoints array shows that the source-character mapping is missing, but it does not tell textkit which source code point or code points belong to that glyph.

A simple same-index repair would work for the minimal 洗水 case, but text shaping is not always one glyph per character. Ligatures, combining marks, glyphs with no source character, bidi/script shaping, repeated characters, adjacent empty mappings, and partial glyph runs can make glyph index and source-codepoint index diverge.
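
To make the divergence concrete, here is a small sketch of the naive same-index repair and a case where it goes wrong. `GlyphLike`, `sameIndexRepair`, and `toText` are hypothetical names used only for illustration; none of this is code from textkit or fontkit.

```typescript
// Hypothetical minimal glyph shape for illustration.
interface GlyphLike {
  id: number;
  codePoints: number[];
}

// Naive repair: assume glyph i was produced by source code point i.
function sameIndexRepair(text: string, glyphs: GlyphLike[]): GlyphLike[] {
  const source = Array.from(text, (ch) => ch.codePointAt(0)!);
  return glyphs.map((g, i) =>
    g.codePoints.length > 0 || i >= source.length
      ? g
      : { ...g, codePoints: [source[i]] },
  );
}

const toText = (g: GlyphLike): string => String.fromCodePoint(...g.codePoints);

// Works when shaping is one glyph per character (the 洗水 case).
const cjkRun = sameIndexRepair("洗水", [
  { id: 1, codePoints: [] },
  { id: 2, codePoints: ["水".codePointAt(0)!] },
]);
console.log(cjkRun.map(toText));

// Breaks for a ligature: "fix" shaped as [fi-ligature, x].
// The ligature glyph receives only "f"; the "i" is silently lost.
const ligatureRun = sameIndexRepair("fix", [
  { id: 10, codePoints: [] },
  { id: 11, codePoints: [0x78] },
]);
console.log(ligatureRun.map(toText));
```

In the second run the ligature glyph ends up mapped to a single code point, which is exactly the kind of silent guess the anchored approach is meant to avoid.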

This patch uses existing non-empty glyph mappings as anchors and fills only unambiguous gaps between those anchors. One empty glyph can receive a multi-codepoint gap, but multiple empty glyphs are only repaired when each can receive exactly one code point. If the full glyph sequence no longer aligns with the original run string, or the gap is ambiguous, the patch stops repairing rather than guessing. Repaired glyphs are cloned so fontkit's cached glyph objects are not mutated.
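
The anchor idea can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the implementation in `generateGlyphs.ts`: `GlyphLike`, `matchesAt`, `fillGap`, and `repairEmptyCodePoints` are hypothetical names, the sketch returns the input array unchanged on any bail-out, and for a single empty glyph it takes the first later position where the next anchor matches, relying on the subsequent alignment checks to reject a wrong choice.

```typescript
// Hypothetical minimal glyph shape for illustration.
interface GlyphLike {
  id: number;
  codePoints: number[];
}

// True when `needle` occurs in `source` starting exactly at index `at`.
function matchesAt(source: number[], needle: number[], at: number): boolean {
  if (at + needle.length > source.length) return false;
  return needle.every((cp, i) => source[at + i] === cp);
}

// Assign source[from, to) to the pending empty glyphs.
// Returns false when the gap has no unique assignment.
function fillGap(
  out: GlyphLike[],
  pending: number[],
  source: number[],
  from: number,
  to: number,
): boolean {
  const gap = to - from;
  if (pending.length === 0) return gap === 0; // nothing may be left unmapped
  if (pending.length === 1) {
    if (gap < 1) return false;
    const gi = pending[0];
    out[gi] = { ...out[gi], codePoints: source.slice(from, to) }; // clone, never mutate
    return true;
  }
  if (gap !== pending.length) return false; // several empties: exactly one each
  pending.forEach((gi, k) => {
    out[gi] = { ...out[gi], codePoints: [source[from + k]] };
  });
  return true;
}

function repairEmptyCodePoints(text: string, glyphs: GlyphLike[]): GlyphLike[] {
  const source = Array.from(text, (ch) => ch.codePointAt(0)!);
  const out = glyphs.slice();
  let cursor = 0; // next unconsumed source code point
  let pending: number[] = []; // indices of empty glyphs since the last anchor

  for (let gi = 0; gi < glyphs.length; gi++) {
    const cps = glyphs[gi].codePoints;
    if (!cps || cps.length === 0) {
      pending.push(gi);
      continue;
    }
    // Non-empty mapping: use it as an anchor. Each pending glyph needs >= 1 code point.
    let at = cursor + pending.length;
    if (pending.length === 1) {
      while (at < source.length && !matchesAt(source, cps, at)) at++;
    }
    if (!matchesAt(source, cps, at)) return glyphs; // misaligned: bail out
    if (!fillGap(out, pending, source, cursor, at)) return glyphs; // ambiguous: bail out
    cursor = at + cps.length;
    pending = [];
  }
  // Trailing empties (or leftover source text) after the last anchor.
  if (!fillGap(out, pending, source, cursor, source.length)) return glyphs;
  return out;
}
```

With this sketch, `repairEmptyCodePoints("洗水", …)` fills the leading empty glyph with the code point of 洗, a single empty glyph before `x` in `"fix"` receives both `f` and `i`, and two empty glyphs in front of a three-code-point gap are left alone because no unique one-to-one assignment exists.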

Testing

  • npx yarn@1.22.19 install --frozen-lockfile
  • npx yarn@1.22.19 vitest packages/textkit/tests/layout/generateGlyphs.test.ts
  • npx yarn@1.22.19 workspace @react-pdf/textkit typecheck
  • npx yarn@1.22.19 vitest packages/textkit/tests
  • ./node_modules/.bin/prettier --check ".changeset/sharp-mugs-repair.md" "packages/textkit/src/layout/generateGlyphs.ts" "packages/textkit/tests/layout/generateGlyphs.test.ts"

AI Assistance

This PR was prepared with assistance from GPT-5.5 Fast (xhigh reasoning). I reviewed the repro, implementation, and test results before submitting.

changeset-bot Bot commented Apr 28, 2026

🦋 Changeset detected

Latest commit: c093c78

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 9 packages
Name Type
@react-pdf/textkit Patch
@react-pdf/layout Patch
@react-pdf/render Patch
@react-pdf/renderer Patch
@react-pdf/math Patch
@react-pdf/mermaid Patch
next-14 Patch
next-15 Patch
@react-pdf/vite-example Patch

brandon-julio-t force-pushed the fix-textkit-empty-glyph-codepoints branch from 64ac337 to c093c78 on April 28, 2026 09:12

Development

Successfully merging this pull request may close these issues.

@react-pdf/textkit drops source character mapping when fontkit returns a glyph with empty codePoints

1 participant