Gurmukhi transliteration: addressed overapplication of virama, normalized nukta characters #698
bgo-eiu wants to merge 3 commits into wikimedia:master from
Can you please check for failing tests and update the pull request?
Yes, thank you for pointing that out. I will update the tests when I get a chance.
There is actually a blocker here: Wikipedia applies NFC normalization to Gurmukhi text, which changes the nukta characters to their legacy decomposed forms. This breaks a number of URLs in Punjabi external links that contain characters like ਫ਼. It also forces users to press backspace more than once to delete a single letter, and can result in some typographic inconsistency. Wikimedia needs to support these characters without decomposing them: ਫ਼ ਲ਼ ਸ਼ ਗ਼ ਖ਼
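The decomposition described above can be reproduced outside MediaWiki. The following is a minimal sketch using Python's standard `unicodedata` module; the `PRECOMPOSED` table and the `nfc_report` helper are illustrative names, not part of this PR or of MediaWiki's normalization code.

```python
import unicodedata

# Precomposed Gurmukhi nukta letters from the Unicode Gurmukhi block.
# Escapes are used so the source file itself cannot be silently normalized.
PRECOMPOSED = {
    "\u0A33": "LLA",
    "\u0A36": "SHA",
    "\u0A59": "KHHA",
    "\u0A5A": "GHHA",
    "\u0A5B": "ZA",
    "\u0A5E": "FA",
}


def nfc_report(ch):
    """Return (changed, code points after NFC) for a single character."""
    normalized = unicodedata.normalize("NFC", ch)
    return normalized != ch, [f"U+{ord(c):04X}" for c in normalized]


if __name__ == "__main__":
    for ch, name in PRECOMPOSED.items():
        changed, cps = nfc_report(ch)
        print(f"U+{ord(ch):04X} {name}: changed={changed} -> {' '.join(cps)}")
```

These code points are listed as composition exclusions in Unicode, so NFC leaves them as base letter plus U+0A3C rather than recomposing them, which is why one visible letter can take two backspaces to delete.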
@bgo-eiu Can you explain the 'typographical inconsistency' part? I am mostly curious. I thought NFC was not supposed to affect the
Hi! I'm sorry that it took unreasonably long to look at it! Totally my bad. This generally looks good, but can you please update the test cases?
Addresses https://phabricator.wikimedia.org/T91159
Additional changes: normalized the nukta characters by using the standalone Unicode characters for them wherever possible, rather than base letter plus combining nukta. Mapped 'q' to kaka with pair bindi: although it is not that common, it is still more common than udaat, which I have moved to 'Q'. Added ways to type all the common nukta/bindi combinations.
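As a rough illustration of "standalone characters wherever possible", the sketch below replaces a base letter followed by the combining nukta with its precomposed letter when Unicode defines one, and leaves other combinations alone. It is written in Python for brevity and is not the PR's rule file; `compose_nukta`, `PRECOMPOSED`, and `KEYMAP` are hypothetical names, and `KEYMAP` only mirrors the 'q' and 'Q' assignments described above.

```python
NUKTA = "\u0A3C"  # GURMUKHI SIGN NUKTA (pair bindi)

# Base letter -> precomposed nukta letter, where Unicode defines one.
PRECOMPOSED = {
    "\u0A32": "\u0A33",  # LA  -> LLA
    "\u0A38": "\u0A36",  # SA  -> SHA
    "\u0A16": "\u0A59",  # KHA -> KHHA
    "\u0A17": "\u0A5A",  # GA  -> GHHA
    "\u0A1C": "\u0A5B",  # JA  -> ZA
    "\u0A2B": "\u0A5E",  # PHA -> FA
}

# Assumed illustration of the key assignments described in the PR text.
KEYMAP = {
    "q": "\u0A15" + NUKTA,  # KA + nukta: no precomposed form exists
    "Q": "\u0A51",          # GURMUKHI SIGN UDAAT
}


def compose_nukta(text):
    """Merge base + nukta into a precomposed letter when one exists."""
    out = []
    for ch in text:
        if ch == NUKTA and out and out[-1] in PRECOMPOSED:
            out[-1] = PRECOMPOSED[out[-1]]
        else:
            out.append(ch)
    return "".join(out)
```

For example, `compose_nukta("\u0A2B\u0A3C")` yields the single code point U+0A5E, while the output for 'q' stays a two-code-point sequence because there is nothing to compose it into.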