Commit e1fc282

Fix whitespacing issue in maps
1 parent 5d0cee3 commit e1fc282

15 files changed: 365 additions and 349 deletions
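The commit does not say how the whitespace was cleaned up. Since the removed and added lines in the hunks below read identically, the change appears to be trailing whitespace and trailing blank lines. A minimal sketch of such a cleanup follows; the function names and the `maps` directory argument are hypothetical, not taken from the repository.

```python
from pathlib import Path


def strip_trailing_whitespace(text: str) -> str:
    """Strip trailing whitespace from every line and blank lines at end of file."""
    lines = [line.rstrip() for line in text.split("\n")]
    # Drop blank lines at the end, then terminate the file with one newline.
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""


def clean_maps(directory: Path) -> list[Path]:
    """Rewrite every *.yaml file in `directory` (hypothetical layout); return changed paths."""
    changed = []
    for path in sorted(directory.glob("*.yaml")):
        original = path.read_text(encoding="utf-8")
        cleaned = strip_trailing_whitespace(original)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling `clean_maps(Path("maps"))` would rewrite only the files that actually change. Note that trailing spaces inside a YAML block scalar are part of the scalar's content, so a cleanup like this alters the loaded `description` strings slightly.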

maps/bgnpcgn-arm-Armn-Latn-1981.yaml

Lines changed: 4 additions & 5 deletions
@@ -9,12 +9,12 @@ url: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/
 creation_date: 2013
 confirmation date: 2019-06
 description: |
-The BGN/PCGN system for Armenian was designed for use in romanizing
-names written in the Armenian alphabet. The Roman letters and letter
+The BGN/PCGN system for Armenian was designed for use in romanizing
+names written in the Armenian alphabet. The Roman letters and letter
 combinations shown as equivalents to the Armenian characters reflect
 the eastern variety of Armenian, i.e. the language spoken in the
 Republic of Armenia.
-
+
 notes:
 - The character ե should be romanized ye initially and after the vowel characters ա, ե, է, ը, ի, ո, ու and օ. In all other instances, it should be romanized e.
 - The character ո should be romanized vo initially except in the word ով, which should be roman- ized ov. In all other instances, it should be romanized o.
@@ -23,7 +23,7 @@ notes:
 - The characters ԵՎ , եւ and եւ should be romanized yev initially, in isolation, and after the vowel characters ա, ե, է, ը, ի, ո, ու, and օ. In all other instances these characters should be romanized ev.
 - All apostrophes appearing in Armenian romanization are encoded Unicode 2019.
 - The Romanization column shows only lowercase forms but, when romanizing, uppercase and lowercase Roman letters as appropriate should be used.
-
+
 tests:
 - source:
 expected:
@@ -107,4 +107,3 @@ map:
 '\u0585' : 'o'
 '\u0586' : 'f'
 '\u0587' : 'ev' # yev initially
-
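The note on ե in the Armenian map above is a context-sensitive rule that a flat character table cannot express by itself. A minimal sketch of that one rule, assuming lowercase-insensitive matching and treating any non-letter as a word boundary; the function name and constants are hypothetical and not part of the map file.

```python
ECH = "\u0565"  # ե
# Vowel characters listed in the note: ա ե է ը ի ո օ (ու is handled as a digraph below).
ARMENIAN_VOWELS = "\u0561\u0565\u0567\u0568\u056B\u0578\u0585"


def romanize_ech(text: str, i: int) -> str:
    """Return 'ye' or 'e' for the letter ե at index i of text."""
    if text[i].lower() != ECH:
        raise ValueError(f"expected ե at index {i}")
    if i == 0 or not text[i - 1].isalpha():
        return "ye"  # word-initial
    prev = text[i - 1].lower()
    if prev in ARMENIAN_VOWELS:
        return "ye"  # after a single vowel character
    if prev == "\u0582" and i >= 2 and text[i - 2].lower() == "\u0578":
        return "ye"  # after the digraph ու (ո + ւ)
    return "e"
```

The same shape (check the preceding character, then fall back to the plain mapping) would apply to the ո and եւ notes, with their own exception lists.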

maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml

Lines changed: 3 additions & 3 deletions
@@ -9,10 +9,10 @@ url: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/
 creation_date: 2013
 confirmation date: 2019-06
 description: |
-This system reflects the Bulgarian national system officially adopted
-by state decree in March 2009. It was adopted by BGN and PCGN in 2013,
+This system reflects the Bulgarian national system officially adopted
+by state decree in March 2009. It was adopted by BGN and PCGN in 2013,
 replacing the BGN/PCGN system of 1952.
-
+
 notes:
 - When in final position, “ия” is romanized as “ia” (e.g., София = Sofia; София-Град= Sofia-Grad).
 - An exception to the romanization system is allowed for the name of the state. Thus, България is roman-

maps/bgnpcgn-per-Arab-Latn-1956.yaml

Lines changed: 15 additions & 15 deletions
@@ -11,27 +11,27 @@ url: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/
 creation_date: 1947
 confirmation_date: 2019-06
 description: |
-This System was adopted by the BGN in 1946 and by the PCGN in 1956
-and is applied in the systematic romanization of geographic names in
-Bahrain, Egypt, Iraq, Jordan, Kuwait, Libya, Oman, Qatar, Saudi Arabia,
+This System was adopted by the BGN in 1946 and by the PCGN in 1956
+and is applied in the systematic romanization of geographic names in
+Bahrain, Egypt, Iraq, Jordan, Kuwait, Libya, Oman, Qatar, Saudi Arabia,
 Syria, the United Arab Emirates, and Yemen.

-Uniform results in the romanization of Arabic are difficult to
-obtain, since vowel points and diacritical marks are generally omitted
-from both manual and machine writing. It follows that for correct
-identification of the words which appear in any particular name,
-knowledge of its standard Arabic- script spelling including proper
-pointing, and recognition of dialectal and idiosyncratic deviations are
+Uniform results in the romanization of Arabic are difficult to
+obtain, since vowel points and diacritical marks are generally omitted
+from both manual and machine writing. It follows that for correct
+identification of the words which appear in any particular name,
+knowledge of its standard Arabic- script spelling including proper
+pointing, and recognition of dialectal and idiosyncratic deviations are
 essential.

-In order to bring about uniformity in the Roman-script spelling of
-geographic names in Arabic- language areas, the system is based insofar
-as possible on fully pointed modern standard Arabic. In the interest of
-clarity, vowel pointing has been applied to the examples below. Arabic
-is written from right to left, and does not make a distinction between
+In order to bring about uniformity in the Roman-script spelling of
+geographic names in Arabic- language areas, the system is based insofar
+as possible on fully pointed modern standard Arabic. In the interest of
+clarity, vowel pointing has been applied to the examples below. Arabic
+is written from right to left, and does not make a distinction between
 upper and lower case.

-notes:
+notes:
 - The symbol ◌ represents any Arabic consonant character.
- "Hamzah (ء) is written in Arabic in association with most instances of initial alif, except those which belong to the definite article al or which bear a maddah (see note 11). Hamzah is written above the alif if the accompanying short vowel is a fatḩah (َأ) or ḑammah (ُأ) and below the alif if the accompanying short vowel is a kasrah (ِإ). When the purpose is to indicate the presence of a glottal stop, hamzah is written over medial and final alif (أ), wāw (ؤ) and yā’ without dots (ئ). Hamzah following kasrah (◌ِ ) is written (ئ). Almost always the yā’ is in the initial or medial form and the dots are omitted: example: (بئر). Hamzah following ḑammah (◌ُ )is written (ؤ). Hamzah following a long vowel is written without a bearer and is positioned on the line of print like a regular character. The romanization of hamzah (’) should always be carefully distinguished from that of ‘ayn (‘)."
- Alif as such is not romanized when it is a bearer of hamzah, but see fatḩah alif (ا◌َ ) and alif maddah .in the vowel table. See also note 2 and 11 above ) آ (

maps/bgnpcgn-rus-Cyrl-Latn-1947.yaml

Lines changed: 20 additions & 20 deletions
@@ -12,42 +12,42 @@ description: |
 The BGN/PCGN system for Russian was adopted by the BGN in 1944 and by the
 PCGN in 1947 for use in romanizing names written in the Russian Cyrillic
 alphabet.
-
+
 notes:
-- The character e should be romanized ye initially, after the vowel
-characters a, e, ё, и, о, у, ы, э, ю, and я, and after й, ъ, and ь. In
+- The character e should be romanized ye initially, after the vowel
+characters a, e, ё, и, о, у, ы, э, ю, and я, and after й, ъ, and ь. In
 all other instances, it should be romanized e.
-
-- The character ё is not considered a separate character of the
-Russian alphabet and the dieresis is generally not shown. When the
-dieresis is shown, the character should be romanized yë initially,
-after the vowel characters a, e, ё, и, о, у, ы, э, ю, and я, and after
-й, ъ, and ь. In all other instances, it should be romanized ё. When the
-dieresis is not shown, the character may still be romanized in the
+
+- The character ё is not considered a separate character of the
+Russian alphabet and the dieresis is generally not shown. When the
+dieresis is shown, the character should be romanized yë initially,
+after the vowel characters a, e, ё, и, о, у, ы, э, ю, and я, and after
+й, ъ, and ь. In all other instances, it should be romanized ё. When the
+dieresis is not shown, the character may still be romanized in the
 preceding manner or, alternatively, in accordance with note 1.
-
+
 - |
-Unusual Russian character sequences occurring primarily in
-non-Russian-language names may be romanized as shown below in order to
-provide differentiation from regularly- occurring digraphs and
+Unusual Russian character sequences occurring primarily in
+non-Russian-language names may be romanized as shown below in order to
+provide differentiation from regularly- occurring digraphs and
 character sequences.
 й before а, у, ы, or э у·
 ы before а, у, ы, or э у·
 ы after any vowel character ·у
 э after any consonant character except й ·е
 TC t·s шч sh·ch
-
-- Unicode values are shown with the uppercase Cyrillic character first,
+
+- Unicode values are shown with the uppercase Cyrillic character first,
 followed by the lowercase character.

 - |
-An inventory of letter-diacritic combinations, with their Unicode
-encoding, in addition to the unmodified letters of the basic Roman
+An inventory of letter-diacritic combinations, with their Unicode
+encoding, in addition to the unmodified letters of the basic Roman
 script is:
 All apostrophes appearing in romanization are U+2019 Ë (U+00CB)
 ë (U+00EB)
-- The Romanization column shows only lowercase forms but, when
-romanizing, uppercase and lowercase Roman letters as appropriate should
+- The Romanization column shows only lowercase forms but, when
+romanizing, uppercase and lowercase Roman letters as appropriate should
 be used.

 tests:

maps/bgnpcgn-ukr-Cyrl-Latn-1965.yaml

Lines changed: 3 additions & 3 deletions
@@ -9,9 +9,9 @@ url: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/
 creation_date: 1947
 confirmation_date: 2019-06
 description: |
-The BGN/PCGN system for Ukrainian was designed for use in romanizing
-names written in the Ukrainian alphabet. The Ukrainian alphabet
-contains five characters not present in the Russian alphabet: ґ, є, і,
+The BGN/PCGN system for Ukrainian was designed for use in romanizing
+names written in the Ukrainian alphabet. The Ukrainian alphabet
+contains five characters not present in the Russian alphabet: ґ, є, і,
 ї, and ’.

 notes:

maps/icao-bel-Cyrl-Latn-9303.yaml

Lines changed: 10 additions & 10 deletions
@@ -8,21 +8,21 @@ name: ICAO Doc 9303 Machine Readable Travel Documents Part 3 -- Belorussian Cyri
 url: https://www.icao.int/publications/pages/publication.aspx?docnum=9303
 creation_date: 2015
 description: |
-Part 3 defines specifications that are common to TD1, TD2 and TD3
-size machine readable travel documents (MRTDs) including those
-necessary for global interoperability using visual inspection and
-machine readable (optical character recognition) means.
+Part 3 defines specifications that are common to TD1, TD2 and TD3
+size machine readable travel documents (MRTDs) including those
+necessary for global interoperability using visual inspection and
+machine readable (optical character recognition) means.

-Since only Latin-alphabet characters are allowed in the VIZ, if
-mandatory data elements are in a national language that does not use
-the Latin alphabet, a transcription or transliteration shall also be
+Since only Latin-alphabet characters are allowed in the VIZ, if
+mandatory data elements are in a national language that does not use
+the Latin alphabet, a transcription or transliteration shall also be
 provided.

-This document defines the transliteration mappings used to produce
+This document defines the transliteration mappings used to produce
 this transcription or transliteration.
 tests:
-- source:
-expected:
+- source:
+expected:
 map:
 # https://www.icao.int/publications/Documents/9303_p3_cons_en.pdf
 characters:

maps/icao-bul-Cyrl-Latn-9303.yaml

Lines changed: 56 additions & 56 deletions
@@ -8,70 +8,70 @@ name: ICAO Doc 9303 Machine Readable Travel Documents Part 3 -- Bulgarian Cyrill
 url: https://www.icao.int/publications/pages/publication.aspx?docnum=9303
 creation_date: 2015
 description: |
-Part 3 defines specifications that are common to TD1, TD2 and TD3
-size machine readable travel documents (MRTDs) including those
-necessary for global interoperability using visual inspection and
-machine readable (optical character recognition) means.
+Part 3 defines specifications that are common to TD1, TD2 and TD3
+size machine readable travel documents (MRTDs) including those
+necessary for global interoperability using visual inspection and
+machine readable (optical character recognition) means.

-Since only Latin-alphabet characters are allowed in the VIZ, if
-mandatory data elements are in a national language that does not use
-the Latin alphabet, a transcription or transliteration shall also be
+Since only Latin-alphabet characters are allowed in the VIZ, if
+mandatory data elements are in a national language that does not use
+the Latin alphabet, a transcription or transliteration shall also be
 provided.

-This document defines the transliteration mappings used to produce
+This document defines the transliteration mappings used to produce
 this transcription or transliteration.
 tests:
-- source:
-expected:
+- source:
+expected:
 map:
 # https://www.icao.int/publications/Documents/9303_p3_cons_en.pdf
 characters:
-"\u0027": "" # '
-"\u0410": "A" # А
-"\u0411": "B" # Б
-"\u0414": "D" # Д
-"\u0401": "E" # Ё
-"\u0415": "E" # Е
-"\u042D": "E" # Э
-"\u0424": "F" # Ф
-"\u0413": "G" # Г
-"\u0418": "I" # И
-"\u0419": "I" # Й
-"\u041A": "K" # К
-"\u041B": "L" # Л
-"\u041C": "M" # М
-"\u041D": "N" # Н
-"\u041E": "O" # О
-"\u041F": "P" # П
-"\u0420": "R" # Р
-"\u0421": "S" # С
-"\u0422": "T" # Т
-"\u0423": "U" # У
-"\u0412": "V" # В
-"\u042B": "Y" # Ы
-"\u0417": "Z" # З
-"\u0427": "CH" # Ч
-"\u042F": "IA" # Я
-"\u042E": "IU" # Ю
-"\u0425": "KH" # Х
-"\u0428": "SH" # Ш
-"\u0429": "SHT" # Щ
-"\u0426": "TS" # Ц
-"\u0416": "ZH" # Ж
-"\u0490": "G" # Ґ
-"\u040E": "U" # Ў
-"\u046A": "U" # Ѫ
-"\u0402": "D" # Ђ
-"\u0405": "DZ" # Ѕ
-"\u0408": "J" # Ј
-"\u0409": "LJ" # Љ
-"\u040A": "NJ" # Њ
-"\u04BA": "C" # Һ
-"\u040F": "DZ" # Џ
-"\u0404": "IE" # Є
-"\u0407": "I" # Ї
-"\u0403": "G" # Ѓ
-"\u0027": "" # '
+"\u0027": "" # '
+"\u0410": "A" # А
+"\u0411": "B" # Б
+"\u0414": "D" # Д
+"\u0401": "E" # Ё
+"\u0415": "E" # Е
+"\u042D": "E" # Э
+"\u0424": "F" # Ф
+"\u0413": "G" # Г
+"\u0418": "I" # И
+"\u0419": "I" # Й
+"\u041A": "K" # К
+"\u041B": "L" # Л
+"\u041C": "M" # М
+"\u041D": "N" # Н
+"\u041E": "O" # О
+"\u041F": "P" # П
+"\u0420": "R" # Р
+"\u0421": "S" # С
+"\u0422": "T" # Т
+"\u0423": "U" # У
+"\u0412": "V" # В
+"\u042B": "Y" # Ы
+"\u0417": "Z" # З
+"\u0427": "CH" # Ч
+"\u042F": "IA" # Я
+"\u042E": "IU" # Ю
+"\u0425": "KH" # Х
+"\u0428": "SH" # Ш
+"\u0429": "SHT" # Щ
+"\u0426": "TS" # Ц
+"\u0416": "ZH" # Ж
+"\u0490": "G" # Ґ
+"\u040E": "U" # Ў
+"\u046A": "U" # Ѫ
+"\u0402": "D" # Ђ
+"\u0405": "DZ" # Ѕ
+"\u0408": "J" # Ј
+"\u0409": "LJ" # Љ
+"\u040A": "NJ" # Њ
+"\u04BA": "C" # Һ
+"\u040F": "DZ" # Џ
+"\u0404": "IE" # Є
+"\u0407": "I" # Ї
+"\u0403": "G" # Ѓ
+"\u0027": "" # '

 "\u0430": "a" # а
 "\u0431": "b" # б
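The `characters` table above maps single code points to output strings of zero, one, or several letters. A minimal sketch of how such a table could be applied, using a small subset copied from the hunk; the `transliterate` helper is hypothetical, and how the project itself loads and applies these maps is not shown in this commit.

```python
# Subset of the ICAO Bulgarian table shown in the diff above.
ICAO_SUBSET = {
    "\u0027": "",     # ' is dropped
    "\u0410": "A",    # А
    "\u0411": "B",    # Б
    "\u0418": "I",    # И
    "\u041E": "O",    # О
    "\u0421": "S",    # С
    "\u0424": "F",    # Ф
    "\u0427": "CH",   # Ч
    "\u0429": "SHT",  # Щ
    "\u042F": "IA",   # Я
}


def transliterate(text: str, table: dict[str, str] = ICAO_SUBSET) -> str:
    """Replace each mapped character; leave unmapped characters unchanged."""
    # str.maketrans accepts a dict of single characters to replacement strings.
    return text.translate(str.maketrans(table))
```

Because every key is a single character, `str.translate` is sufficient here. Rules that depend on neighbouring characters, such as the notes in the BGN/PCGN maps, would need a separate pass.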
