structured-concept:
char

concept-created: {2018-06-19}

description of char

description::
· char I call any UNIT of WRITTEN human or non-human languages.
[hmnSgm.2018-06-19]

name::
* cpt.filMcsChar.last.html,
* cpt.dirLag/filMcsChar.last.html,
* cpt.char,
* cpt.character.language,
* cpt.language'character,

glyph of char

description::
· glyph-of-char[a] is ANY written or printed icon associated with the-char[a].
· some chars, eg computer-control-chars, have NO glyphs.
[hmnSgm.2018-07-11]

name::
* cpt.char'glyph,
* cpt.glyph-of-char,

glyph.FONT

description::
· font is A-SET of glyphs of a-set of chars with similar attributes.
[hmnSgm.2018-06-26]

name::
* cpt.char'font,
* cpt.char'glyph.font,
* cpt.font-of-char,

encoding of char

description::
· char-encoding is a-code of bits|bytes that computers use to represent chars.
· do not confuse the-code-points-of-chars, which are numbers, with the-encodings-of-chars which are codes of bits, especially when BOTH are-represented with hexadecimal-numbers.
[hmnSgm.2018-06-26]
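
· a minimal Python sketch of the distinction above, assuming Python 3.8+ (for `bytes.hex` with a separator). The function name `describe` is hypothetical. It shows one number (the code-point) next to two different codes of bytes (UTF-8 and UTF-16 big-endian) for the same char.

```python
def describe(ch: str) -> dict:
    """Return the code-point and two encodings of a single char."""
    return {
        # The code-point is a number; U+hhhh is only a way to write it.
        "code_point": f"U+{ord(ch):04X}",
        # The encodings are sequences of bytes, shown here in hexadecimal.
        "utf8": ch.encode("utf-8").hex(" "),
        "utf16be": ch.encode("utf-16-be").hex(" "),
    }


if __name__ == "__main__":
    for ch in ("A", "€", "😀"):
        print(ch, describe(ch))
```

· for "€" the code-point is U+20AC while its UTF-8 encoding is the three bytes e2 82 ac, so the hexadecimal digits of the two must not be read as the same thing.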

name::
* cpt.char'encoding,
* cpt.encoding-of-char,

resource of char

name::
* cpt.char'resource,
* cpt.charAeResource,

addressWpg::
*

GENERIC of char

Generic-chain::
* written-language-attribute,
...
* entity,

char.SPECIFIC

name::
* cpt.char.specific,
* cpt.charAsSpecific,

specific::
* computer-char,
* escape-char,
* graphic-char,
* graphicNo-char,
* Html-char,
* Unicode-char,

char.COMPUTER

description::
· computer-char is any char used in computer-languages.
· they are units of human-texts AND units of formats of computer-texts.
[hmnSgm.2018-06-24]

name::
* cpt.char.computer,
* cpt.computer-char,

computer-char.SPECIFIC

specific::
* Unicode-char,

char.UNICODE

description::
· Unicode-char is any computer-char of the-Unicode-standard.

name::
* cpt.char.Unicode,
* cpt.Unicode'char,
* cpt.Unicode-char,

code-point of Unicode-char

description::
· code-point of a-Unicode-char[a] is the UNIQUE number the-standard assigns to the-char[a], written in hexadecimal-format (U+hhhh, with 4 to 6 hexadecimal-digits in the-standard) or in decimal-format.
· the-Unicode-standard reserves 65,536x17=1,114,112 code-points to assign chars.
[hmnSgm.2018-06-24]
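
· a short Python sketch of the two notations and of the size of the code-space, under the assumption that a code-point is handled as a plain integer. The names `to_u_notation` and `from_u_notation` are hypothetical helpers.

```python
PLANE_SIZE = 65_536
PLANE_COUNT = 17
CODE_SPACE_SIZE = PLANE_SIZE * PLANE_COUNT  # 1,114,112 code-points


def to_u_notation(code_point: int) -> str:
    """Format a code-point as U+hhhh, using 4 to 6 hexadecimal digits."""
    if not 0 <= code_point < CODE_SPACE_SIZE:
        raise ValueError(f"{code_point} is outside the Unicode code-space")
    return f"U+{code_point:04X}"


def from_u_notation(text: str) -> int:
    """Parse U+hhhh back into the decimal code-point."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"{text!r} is not in U+hhhh notation")
    code_point = int(text[2:], 16)
    if not 0 <= code_point < CODE_SPACE_SIZE:
        raise ValueError(f"{text} is outside the Unicode code-space")
    return code_point


if __name__ == "__main__":
    print(to_u_notation(ord("€")), from_u_notation("U+20AC"))
```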

name::
* cpt.char.Unicode'code-point,
* cpt.code-point--of--Unicode-char,
* cpt.Unicode'code-point,

code-point.CODE-SPACE

description::
· Unicode--code-space is the aggregate of the code-points, 65,536x17=1,114,112, that the-standard reserves to assign chars.

name::
* cpt.char.Unicode'code-space,
* cpt.code-space--of-Unicode,
* cpt.Unicode'code-space,

code-point.PLANE

description::
In the Unicode standard, a plane is a continuous group of 65,536 (2^16) code points.
There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10 (hexadecimal) of the first two positions in the six-position hexadecimal format (U+hhhhhh).
The very last code point in Unicode is the last code point in plane 16, U+10FFFF.
Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly-used characters.
The higher planes 1 through 16 are called "supplementary planes".
As of Unicode version 11.0, six of the planes have assigned code points (characters), and four are named.
[https://en.wikipedia.org/wiki/Plane_(Unicode), {2018-06-24}]
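
· because each plane holds 2^16 code points, the plane number is what remains after dropping the low 16 bits of a code point. A minimal Python sketch follows; the names `plane_of` and `plane_range` are hypothetical, and the abbreviations are the four named planes of this section.

```python
from typing import Tuple

# The four named planes described in this document.
PLANE_NAMES = {0: "BMP", 1: "SMP", 2: "SIP", 14: "SSP"}


def plane_of(code_point: int) -> int:
    """Return the plane (0..16) that contains the code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    return code_point >> 16  # drop the low 16 bits


def plane_range(plane: int) -> Tuple[int, int]:
    """Return the first and last code point of a plane."""
    if not 0 <= plane <= 16:
        raise ValueError("planes are numbered 0 to 16")
    first = plane << 16
    return first, first + 0xFFFF


if __name__ == "__main__":
    cp = ord("😀")
    print(f"U+{cp:04X} is in plane {plane_of(cp)}", PLANE_NAMES.get(plane_of(cp)))
```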

name::
* cpt.char.Unicode'plane,
* cpt.plane-of-Unicode,
* cpt.Unicode-char'plane,
* cpt.Unicode'plane,
* cpt.Unicode-plane,

plane.0 (BMP)

description::
The first plane, plane 0, the Basic Multilingual Plane (BMP) contains characters for almost all modern languages, and a large number of symbols. A primary objective for the BMP is to support the unification of prior character sets as well as characters for writing. Most of the assigned code points in the BMP are used to encode Chinese, Japanese, and Korean (CJK) characters.
The High Surrogates (U+D800–U+DBFF) and Low Surrogates (U+DC00–U+DFFF) codes are reserved for encoding non-BMP characters in UTF-16 by using a pair of 16-bit codes: one High Surrogate and one Low Surrogate. A single surrogate code point will never be assigned a character.
65,472 of the 65,536 code points in this plane have been allocated to a Unicode block, leaving just 64 code points in unallocated ranges (48 code points at 0870..089F and 16 code points at 2FE0..2FEF).
As of Unicode 11.0, the BMP comprises 163 blocks.
[https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane]
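
· the surrogate mechanism described above can be sketched in Python as follows. The function names are hypothetical; the arithmetic is the standard UTF-16 rule: subtract 0x10000, then split the remaining 20 bits into two halves of 10 bits each.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a non-BMP code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points of planes 1..16 need surrogates")
    offset = code_point - 0x10000      # a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x1F600)
    print(f"U+1F600 -> {high:04X} {low:04X}")
```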

name::
* cpt.BMP-Basic-Multilingual-Plane--of-Unicode,
* cpt.char.Unicode'plane.0-BMP-Basic-Multilingual-Plane,
* cpt.Unicode'plane.0-BMP-Basic-Multilingual-Plane,

plane.1 (SMP)

description::
Plane 1, the Supplementary Multilingual Plane (SMP), contains historic scripts (except CJK ideographic), and symbols and notation used within certain fields. Scripts include Linear B, Egyptian hieroglyphs, and cuneiform scripts. It also includes English reform orthographies like Shavian and Deseret, and some modern scripts like Osage, Warang Citi, and Adlam. Symbols and notations include historic and modern musical notation; mathematical alphanumerics; Emoji and other pictographic sets; and game symbols for playing cards, Mah Jongg, and dominoes.
As of Unicode 11.0, the SMP comprises 118 blocks.
[https://en.wikipedia.org/wiki/Plane_(Unicode)#Supplementary_Multilingual_Plane]

name::
* cpt.char.Unicode'plane.1-SMP-Supplementary-Multilingual-Plane,
* cpt.SMP-Supplementary-Multilingual-Plane--of-Unicode,
* cpt.Unicode'plane.1-SMP-Supplementary-Multilingual-Plane,

plane.2 (SIP)

description::
Plane 2, the Supplementary Ideographic Plane (SIP), is used for CJK Ideographs, mostly CJK Unified Ideographs, that were not included in earlier character encoding standards.
As of Unicode 11.0, the SIP comprises the following six blocks:
* CJK Unified Ideographs Extension B (20000–2A6DF)
* CJK Unified Ideographs Extension C (2A700–2B73F)
* CJK Unified Ideographs Extension D (2B740–2B81F)
* CJK Unified Ideographs Extension E (2B820–2CEAF)
* CJK Unified Ideographs Extension F (2CEB0–2EBEF)
* CJK Compatibility Ideographs Supplement (2F800–2FA1F)
[https://en.wikipedia.org/wiki/Plane_(Unicode)#Supplementary_Ideographic_Plane]

name::
* cpt.char.Unicode'plane.2-SIP-Supplementary-Ideographic-Plane,
* cpt.SIP-Supplementary-Ideographic-Plane--of-Unicode,
* cpt.Unicode'plane.2-SIP-Supplementary-Ideographic-Plane,

plane.14 (SSP)

description::
Plane 14 (E in hexadecimal), the Supplementary Special-purpose Plane (SSP), currently contains non-graphical characters. The first block is for special use tag characters. The other block contains glyph variation selectors to indicate an alternate glyph for a character that cannot be determined by context.
As of Unicode 11.0, the SSP comprises the following two blocks:
* Tags (E0000–E007F)
* Variation Selectors Supplement (E0100–E01EF)

name::
* cpt.char.Unicode'plane.14-SSP-Supplementary-Special-purpose-Plane,
* cpt.SSP-Supplementary-Special-purpose-Plane--of-Unicode,
* cpt.Unicode'plane.14-SSP-Supplementary-Special-purpose-Plane,

plane.SUPPLEMENTARY

description::
· the-planes 1 through 16 are-called "supplementary-planes".

name::
* cpt.char.Unicode'plane.supplementary,
* cpt.supplementary-plane--of-Unicode,
* cpt.Unicode'plane.supplementary,

plane.UNASSIGNED

description::
Planes 3 to 13 (planes 3 to D in hexadecimal): No characters have yet been assigned to Planes 3 through 13. Plane 3 is tentatively named the Tertiary Ideographic Plane (TIP), but as of version 11.0 there are no characters assigned to it. It is reserved for Oracle Bone script, Bronze Script, Small Seal Script, additional CJK unified ideographs, supplement characters for existing scripts, and other historic ideographic scripts.
It is not anticipated that all these planes will be used in the foreseeable future, given the total sizes of the known writing systems left to be encoded. The number of possible symbol characters that could arise outside of the context of writing systems is potentially huge. At the moment, these 11 planes out of 17 are unused.
[https://en.wikipedia.org/wiki/Plane_(Unicode)#Unassigned_planes]

name::
* cpt.char.Unicode'plane.unassigned,
* cpt.unassigned-plane--of-Unicode,
* cpt.unused-plane--of-Unicode,
* cpt.Unicode'plane.unassigned,

plane.PRIVATE-USE-AREA

description::
The two planes 15 and 16 (planes F and 10 in hexadecimal), are designated as "private use planes". They contain blocks called Supplementary Private Use Area-A (PUA-A) and -B (PUA-B), Private Use Areas, which are available for character assignment by parties outside the ISO and the Unicode Consortium. They are used by fonts internally to refer to auxiliary glyphs, for example, ligatures and building blocks for other glyphs. Such characters will have limited interoperability. Software and fonts that support Unicode will not necessarily support character assignments by other parties.
[https://en.wikipedia.org/wiki/Plane_(Unicode)#Private_Use_Area_planes]

name::
* cpt.char.Unicode'plane.PUA-Private-Use-Area,
* cpt.PUA-Private-Use-Area-plane--of-Unicode,
* cpt.Unicode'plane.PUA-Private-Use-Area,

code-point.BLOCK

description::
A block is a uniquely named, contiguous range of code points.
It is identified by its first and last code point.
Blocks do not overlap.
A block may contain code points that are reserved, not-assigned etc.
Each character that is assigned, has a single "block name" value from the 291 names assigned as of Unicode version 11.0.
Unassigned code points outside of an existing block have the default value "No_Block".
[https://en.wikipedia.org/wiki/Unicode_character_property#Block {2018-06-26}]
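
· since blocks are contiguous and do not overlap, the block of a code point can be found with a binary search over the first code points of the blocks. The Python sketch below uses only a five-block EXCERPT of the Unicode 11.0 list, so any code point outside the excerpt also falls through to the default value (spelled No_Block in the Unicode data files). The names `BLOCKS` and `block_of` are hypothetical.

```python
from bisect import bisect_right

# Excerpt only: (first, last, name), sorted by first code point.
BLOCKS = [
    (0x0000, 0x007F, "Basic-Latin"),
    (0x0080, 0x00FF, "Latin-1-Supplement"),
    (0x0370, 0x03FF, "Greek-and-Coptic"),
    (0x0860, 0x086F, "Syriac-Supplement"),
    (0x08A0, 0x08FF, "Arabic-Extended-A"),
]
STARTS = [first for first, _, _ in BLOCKS]


def block_of(code_point: int) -> str:
    """Return the block name, or "No_Block" if no listed block matches."""
    index = bisect_right(STARTS, code_point) - 1  # last block starting <= cp
    if index >= 0:
        _, last, name = BLOCKS[index]
        if code_point <= last:
            return name
    return "No_Block"


if __name__ == "__main__":
    for cp in (0x41, 0x3A9, 0x0870):
        print(f"U+{cp:04X}", block_of(cp))
```

· U+0870 lies in the unallocated range 0870..089F of Unicode 11.0, which is why the excerpt reports no block for it.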

name::
* cpt.block-of-Unicode,
* cpt.char.Unicode'block,
* cpt.Unicode'block,
* cpt.Unicode'char'block,
* cpt.Unicode-block,

plane of Unicode-block

description::
· every block is part of a-plane.

name::
* cpt.Unicode'block'plane,

code-point of Unicode-block

description::
· every block contains some code-points.
· there are 280,752 code-points in 291 blocks in Unicode.11-0-0.2018 and 137,439 chars assigned to them.
[hmnSgm.2018-06-26]

name::
* cpt.Unicode'block'code-point,

char of Unicode-block

description::
· of the 280,752 code-points in 291 blocks in Unicode.11-0-0.2018, ONLY 137,439 are-assigned chars.
[hmnSgm.2018-06-26]
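
· a sketch of how the chars of a block could be counted with Python's standard `unicodedata` module. It assumes that "assigned char" means a general-category other than Cn (unassigned), Cs (surrogate) or Co (private-use), which matches the 0 counts this document gives for the surrogate and private-use blocks. The result depends on the Unicode version bundled with the interpreter (`unicodedata.unidata_version`), so it can differ from the 11.0 numbers. The name `count_assigned` is hypothetical.

```python
import unicodedata

# General categories that do not count as an assigned char here.
NOT_A_CHAR = {"Cn", "Cs", "Co"}


def count_assigned(first: int, last: int) -> int:
    """Count code-points in first..last that carry an assigned char."""
    return sum(
        1
        for cp in range(first, last + 1)
        if unicodedata.category(chr(cp)) not in NOT_A_CHAR
    )


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print("Basic-Latin:", count_assigned(0x0000, 0x007F))
    print("Private-Use-Area:", count_assigned(0xE000, 0xF8FF))
```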

name::
* cpt.Unicode'block'char,

Unicode-block.SPECIFIC

specific::
(code-points/chars)
  1. U+0000|0..U+007F|127, Basic-Latin, 128/128
  2. U+0080|128..U+00FF|255|ÿ, Latin-1-Supplement, 128/128
  3. U+0100|256|Ā..U+017F|383|ſ, Latin-Extended-A, 128/128
  4. U+0180|384|ƀ..U+024F|591|ɏ, Latin-Extended-B, 208/208
  5. U+0250|592|ɐ..U+02AF|687|ʯ, IPA-Extensions, 96/96
  6. U+02B0|688|ʰ..U+02FF|767|˿, Spacing-Modifier-Letters, 80/80
  7. U+0300|768|̀..U+036F|879|ͯ, Combining-Diacritical-Marks, 112/112
  8. U+0370|880|Ͱ..U+03FF|1023|Ͽ, Greek-and-Coptic, 144/135
  9. U+0400|1024|Ѐ..U+04FF|1279|ӿ, Cyrillic, 256/256
  10. U+0500|1280|Ԁ..U+052F|1327|ԯ, Cyrillic-Supplement, 48/48
  11. U+0530|1328|԰..U+058F|1423|֏, Armenian, 96/91
  12. U+0590|1424|֐..U+05FF|1535|׿, Hebrew, 112/88
  13. U+0600|1536|؀..U+06FF|1791|ۿ, Arabic, 256/255
  14. U+0700|1792|܀..U+074F|1871|ݏ, Syriac, 80/77
  15. U+0750|1872|ݐ..U+077F|1919|ݿ, Arabic-Supplement, 48/48
  16. U+0780|1920|ހ..U+07BF|1983|޿, Thaana, 64/50
  17. U+07C0|1984|߀..U+07FF|2047|߿, NKo, 64/62
  18. U+0800|2048|ࠀ..U+083F|2111|࠿, Samaritan, 64/61
  19. U+0840|2112|ࡀ..U+085F|2143|࡟, Mandaic, 32/29
  20. U+0860|2144|ࡠ..U+086F|2159|࡯, Syriac-Supplement, 16/11
  21. U+08A0|2208|ࢠ..U+08FF|2303|ࣿ, Arabic-Extended-A, 96/74
  22. U+0900|2304|ऀ..U+097F|2431|ॿ, Devanagari, 128/128
  23. U+0980|2432|ঀ..U+09FF|2559|৿, Bengali, 128/96
  24. U+0A00|2560|਀..U+0A7F|2687|੿, Gurmukhi, 128/80
  25. U+0A80|2688|઀..U+0AFF|2815|૿, Gujarati, 128/91
  26. U+0B00|2816|଀..U+0B7F|2943|୿, Oriya, 128/90
  27. U+0B80|2944|஀..U+0BFF|3071|௿, Tamil, 128/72
  28. U+0C00|3072|ఀ..U+0C7F|3199|౿, Telugu, 128/97
  29. U+0C80|3200|ಀ..U+0CFF|3327|೿, Kannada, 128/89
  30. U+0D00|3328|ഀ..U+0D7F|3455|ൿ, Malayalam, 128/117
  31. U+0D80|3456|඀..U+0DFF|3583|෿, Sinhala, 128/90
  32. U+0E00|3584|฀..U+0E7F|3711|๿, Thai, 128/87
  33. U+0E80|3712|຀..U+0EFF|3839|໿, Lao, 128/67
  34. U+0F00|3840|ༀ..U+0FFF|4095|࿿, Tibetan, 256/211
  35. U+1000|4096|က..U+109F|4255|႟, Myanmar, 160/160
  36. U+10A0|4256|Ⴀ..U+10FF|4351|ჿ, Georgian, 96/88
  37. U+1100|4352|ᄀ..U+11FF|4607|ᇿ, Hangul-Jamo, 256/256
  38. U+1200|4608|ሀ..U+137F|4991|፿, Ethiopic, 384/358
  39. U+1380|4992|ᎀ..U+139F|5023|᎟, Ethiopic-Supplement, 32/26
  40. U+13A0|5024|Ꭰ..U+13FF|5119|᏿, Cherokee, 96/92
  41. U+1400|5120|᐀..U+167F|5759|ᙿ, Unified-Canadian-Aboriginal-Syllabics, 640/640
  42. U+1680|5760| ..U+169F|5791|᚟, Ogham, 32/29
  43. U+16A0|5792|ᚠ..U+16FF|5887|᛿, Runic, 96/89
  44. U+1700|5888|ᜀ..U+171F|5919|ᜟ, Tagalog, 32/20
  45. U+1720|5920|ᜠ..U+173F|5951|᜿, Hanunoo, 32/23
  46. U+1740|5952|ᝀ..U+175F|5983|᝟, Buhid, 32/20
  47. U+1760|5984|ᝠ..U+177F|6015|᝿, Tagbanwa, 32/18
  48. U+1780|6016|ក..U+17FF|6143|៿, Khmer, 128/114
  49. U+1800|6144|᠀..U+18AF|6319|᢯, Mongolian, 176/157
  50. U+18B0|6320|ᢰ..U+18FF|6399|᣿, Unified-Canadian-Aboriginal-Syllabics-Extended, 80/70
  51. U+1900|6400|ᤀ..U+194F|6479|᥏, Limbu, 80/68
  52. U+1950|6480|ᥐ..U+197F|6527|᥿, Tai-Le, 48/35
  53. U+1980|6528|ᦀ..U+19DF|6623|᧟, New-Tai-Lue, 96/83
  54. U+19E0|6624|᧠..U+19FF|6655|᧿, Khmer-Symbols, 32/32
  55. U+1A00|6656|ᨀ..U+1A1F|6687|᨟, Buginese, 32/30
  56. U+1A20|6688|ᨠ..U+1AAF|6831|᪯, Tai-Tham, 144/127
  57. U+1AB0|6832|᪰..U+1AFF|6911|᫿, Combining-Diacritical-Marks-Extended, 80/15
  58. U+1B00|6912|ᬀ..U+1B7F|7039|᭿, Balinese, 128/121
  59. U+1B80|7040|ᮀ..U+1BBF|7103|ᮿ, Sundanese, 64/64
  60. U+1BC0|7104|ᯀ..U+1BFF|7167|᯿, Batak, 64/56
  61. U+1C00|7168|ᰀ..U+1C4F|7247|ᱏ, Lepcha, 80/74
  62. U+1C50|7248|᱐..U+1C7F|7295|᱿, Ol-Chiki, 48/48
  63. U+1C80|7296|ᲀ..U+1C8F|7311|᲏, Cyrillic-Extended-C, 16/9
  64. U+1C90|7312|Ა..U+1CBF|7359|Ჿ, Georgian-Extended, 48/46
  65. U+1CC0|7360|᳀..U+1CCF|7375|᳏, Sundanese-Supplement, 16/8
  66. U+1CD0|7376|᳐..U+1CFF|7423|᳿, Vedic-Extensions, 48/42
  67. U+1D00|7424|ᴀ..U+1D7F|7551|ᵿ, Phonetic-Extensions, 128/128
  68. U+1D80|7552|ᶀ..U+1DBF|7615|ᶿ, Phonetic-Extensions-Supplement, 64/64
  69. U+1DC0|7616|᷀..U+1DFF|7679|᷿, Combining-Diacritical-Marks-Supplement, 64/63
  70. U+1E00|7680|Ḁ..U+1EFF|7935|ỿ, Latin-Extended-Additional, 256/256
  71. U+1F00|7936|ἀ..U+1FFF|8191|῿, Greek-Extended, 256/233
  72. U+2000|8192| ..U+206F|8303|, General-Punctuation, 112/111
  73. U+2070|8304|⁰..U+209F|8351|₟, Superscripts-and-Subscripts, 48/42
  74. U+20A0|8352|₠..U+20CF|8399|⃏, Currency-Symbols, 48/32
  75. U+20D0|8400|⃐..U+20FF|8447|⃿, Combining-Diacritical-Marks-for-Symbols, 48/33
  76. U+2100|8448|℀..U+214F|8527|⅏, Letterlike-Symbols, 80/80
  77. U+2150|8528|⅐..U+218F|8591|↏, Number-Forms, 64/60
  78. U+2190|8592|←..U+21FF|8703|⇿, Arrows, 112/112
  79. U+2200|8704|∀..U+22FF|8959|⋿, Mathematical-Operators, 256/256
  80. U+2300|8960|⌀..U+23FF|9215|⏿, Miscellaneous-Technical, 256/256
  81. U+2400|9216|␀..U+243F|9279|␿, Control-Pictures, 64/39
  82. U+2440|9280|⑀..U+245F|9311|⑟, Optical-Character-Recognition, 32/11
  83. U+2460|9312|①..U+24FF|9471|⓿, Enclosed-Alphanumerics, 160/160
  84. U+2500|9472|─..U+257F|9599|╿, Box-Drawing, 128/128
  85. U+2580|9600|▀..U+259F|9631|▟, Block-Elements, 32/32
  86. U+25A0|9632|■..U+25FF|9727|◿, Geometric-Shapes, 96/96
  87. U+2600|9728|☀..U+26FF|9983|⛿, Miscellaneous-Symbols, 256/256
  88. U+2700|9984|✀..U+27BF|10175|➿, Dingbats, 192/192
  89. U+27C0|10176|⟀..U+27EF|10223|⟯, Miscellaneous-Mathematical-Symbols-A, 48/48
  90. U+27F0|10224|⟰..U+27FF|10239|⟿, Supplemental-Arrows-A, 16/16
  91. U+2800|10240|⠀..U+28FF|10495|⣿, Braille-Patterns, 256/256
  92. U+2900|10496|⤀..U+297F|10623|⥿, Supplemental-Arrows-B, 128/128
  93. U+2980|10624|⦀..U+29FF|10751|⧿, Miscellaneous-Mathematical-Symbols-B, 128/128
  94. U+2A00|10752|⨀..U+2AFF|11007|⫿, Supplemental-Mathematical-Operators, 256/256
  95. U+2B00|11008|⬀..U+2BFF|11263|⯿, Miscellaneous-Symbols-and-Arrows, 256/250
  96. U+2C00|11264|Ⰰ..U+2C5F|11359|ⱟ, Glagolitic, 96/94
  97. U+2C60|11360|Ⱡ..U+2C7F|11391|Ɀ, Latin-Extended-C, 32/32
  98. U+2C80|11392|Ⲁ..U+2CFF|11519|⳿, Coptic, 128/123
  99. U+2D00|11520|ⴀ..U+2D2F|11567|⴯, Georgian-Supplement, 48/40
  100. U+2D30|11568|ⴰ..U+2D7F|11647|⵿, Tifinagh, 80/59
  101. U+2D80|11648|ⶀ..U+2DDF|11743|⷟, Ethiopic-Extended, 96/79
  102. U+2DE0|11744|ⷠ..U+2DFF|11775|ⷿ, Cyrillic-Extended-A, 32/32
  103. U+2E00|11776|⸀..U+2E7F|11903|⹿, Supplemental-Punctuation, 128/79
  104. U+2E80|11904|⺀..U+2EFF|12031|⻿, CJK-Radicals-Supplement, 128/115
  105. U+2F00|12032|⼀..U+2FDF|12255|⿟, Kangxi-Radicals, 224/214
  106. U+2FF0|12272|⿰..U+2FFF|12287|⿿, Ideographic-Description-Characters, 16/12
  107. U+3000|12288| ..U+303F|12351|〿, CJK-Symbols-and-Punctuation, 64/64
  108. U+3040|12352|぀..U+309F|12447|ゟ, Hiragana, 96/93
  109. U+30A0|12448|゠..U+30FF|12543|ヿ, Katakana, 96/96
  110. U+3100|12544|㄀..U+312F|12591|ㄯ, Bopomofo, 48/43
  111. U+3130|12592|㄰..U+318F|12687|㆏, Hangul-Compatibility-Jamo, 96/94
  112. U+3190|12688|㆐..U+319F|12703|㆟, Kanbun, 16/16
  113. U+31A0|12704|ㆠ..U+31BF|12735|ㆿ, Bopomofo-Extended, 32/27
  114. U+31C0|12736|㇀..U+31EF|12783|㇯, CJK-Strokes, 48/36
  115. U+31F0|12784|ㇰ..U+31FF|12799|ㇿ, Katakana-Phonetic-Extensions, 16/16
  116. U+3200|12800|㈀..U+32FF|13055|㋿, Enclosed-CJK-Letters-and-Months, 256/254
  117. U+3300|13056|㌀..U+33FF|13311|㏿, CJK-Compatibility, 256/256
  118. (U+3400|13312|㐀..U+4DBF|19903|䶿, CJK-Unified-Ideographs-Extension-A, 6592/6582)
  119. U+4DC0|19904|䷀..U+4DFF|19967|䷿, Yijing-Hexagram-Symbols, 64/64
  120. (U+4E00|19968|一..U+9FFF|40959|鿿, CJK-Unified-Ideographs, 20992/20976)
  121. U+A000|40960|ꀀ..U+A48F|42127|꒏, Yi-Syllables, 1168/1165
  122. U+A490|42128|꒐..U+A4CF|42191|꓏, Yi-Radicals, 64/55
  123. U+A4D0|42192|ꓐ..U+A4FF|42239|꓿, Lisu, 48/48
  124. U+A500|42240|ꔀ..U+A63F|42559|꘿, Vai, 320/300
  125. U+A640|42560|Ꙁ..U+A69F|42655|ꚟ, Cyrillic-Extended-B, 96/96
  126. U+A6A0|42656|ꚠ..U+A6FF|42751|꛿, Bamum, 96/88
  127. U+A700|42752|꜀..U+A71F|42783|ꜟ, Modifier-Tone-Letters, 32/32
  128. U+A720|42784|꜠..U+A7FF|43007|ꟿ, Latin-Extended-D, 224/163
  129. U+A800|43008|ꠀ..U+A82F|43055|꠯, Syloti-Nagri, 48/44
  130. U+A830|43056|꠰..U+A83F|43071|꠿, Common-Indic-Number-Forms, 16/10
  131. U+A840|43072|ꡀ..U+A87F|43135|꡿, Phags-pa, 64/56
  132. U+A880|43136|ꢀ..U+A8DF|43231|꣟, Saurashtra, 96/82
  133. U+A8E0|43232|꣠..U+A8FF|43263|ꣿ, Devanagari-Extended, 32/32
  134. U+A900|43264|꤀..U+A92F|43311|꤯, Kayah-Li, 48/48
  135. U+A930|43312|ꤰ..U+A95F|43359|꥟, Rejang, 48/37
  136. U+A960|43360|ꥠ..U+A97F|43391|꥿, Hangul-Jamo-Extended-A, 32/29
  137. U+A980|43392|ꦀ..U+A9DF|43487|꧟, Javanese, 96/91
  138. U+A9E0|43488|ꧠ..U+A9FF|43519|꧿, Myanmar-Extended-B, 32/31
  139. U+AA00|43520|ꨀ..U+AA5F|43615|꩟, Cham, 96/83
  140. U+AA60|43616|ꩠ..U+AA7F|43647|ꩿ, Myanmar-Extended-A, 32/32
  141. U+AA80|43648|ꪀ..U+AADF|43743|꫟, Tai-Viet, 96/72
  142. U+AAE0|43744|ꫠ..U+AAFF|43775|꫿, Meetei-Mayek-Extensions, 32/23
  143. U+AB00|43776|꬀..U+AB2F|43823|꬯, Ethiopic-Extended-A, 48/32
  144. U+AB30|43824|ꬰ..U+AB6F|43887|꭯, Latin-Extended-E, 64/54
  145. U+AB70|43888|ꭰ..U+ABBF|43967|ꮿ, Cherokee-Supplement, 80/80
  146. U+ABC0|43968|ꯀ..U+ABFF|44031|꯿, Meetei-Mayek, 64/56
  147. (U+AC00|44032|가..U+D7AF|55215|힯, Hangul-Syllables, 11184/11172)
  148. U+D7B0|55216|ힰ..U+D7FF|55295|퟿, Hangul-Jamo-Extended-B, 80/72
  149. (U+D800|55296..U+DB7F|56191, High-Surrogates, 896/0)
  150. (U+DB80|56192..U+DBFF|56319, High-Private-Use-Surrogates, 128/0)
  151. (U+DC00|56320..U+DFFF|57343, Low-Surrogates, 1024/0)
  152. (U+E000|57344..U+F8FF|63743, Private-Use-Area, 6400/0)
  153. U+F900|63744|豈..U+FAFF|64255|﫿, CJK-Compatibility-Ideographs, 512/472
  154. U+FB00|64256|ﬀ..U+FB4F|64335|ﭏ, Alphabetic-Presentation-Forms, 80/58
  155. U+FB50|64336|ﭐ..U+FDFF|65023|﷿, Arabic-Presentation-Forms-A, 688/611
  156. U+FE00|65024|︀..U+FE0F|65039|️, Variation-Selectors, 16/16
  157. U+FE10|65040|︐..U+FE1F|65055|︟, Vertical-Forms, 16/10
  158. U+FE20|65056|︠..U+FE2F|65071|︯, Combining-Half-Marks, 16/16
  159. U+FE30|65072|︰..U+FE4F|65103|﹏, CJK-Compatibility-Forms, 32/32
  160. U+FE50|65104|﹐..U+FE6F|65135|﹯, Small-Form-Variants, 32/26
  161. U+FE70|65136|ﹰ..U+FEFF|65279|, Arabic-Presentation-Forms-B, 144/141
  162. U+FF00|65280|＀..U+FFEF|65519|￯, Halfwidth-and-Fullwidth-Forms, 240/225
  163. U+FFF0|65520|￰..U+FFFF|65535|￿, Specials, 16/5
  164. U+10000|65536|𐀀..U+1007F|65663|𐁿, Linear-B-Syllabary, 128/88
  165. U+10080|65664|𐂀..U+100FF|65791|𐃿, Linear-B-Ideograms, 128/123
  166. U+10100|65792|𐄀..U+1013F|65855|𐄿, Aegean-Numbers, 64/57
  167. U+10140|65856|𐅀..U+1018F|65935|𐆏, Ancient-Greek-Numbers, 80/79
  168. U+10190|65936|𐆐..U+101CF|65999|𐇏, Ancient-Symbols, 64/13
  169. U+101D0|66000|𐇐..U+101FF|66047|𐇿, Phaistos-Disc, 48/46
  170. U+10280|66176|𐊀..U+1029F|66207|𐊟, Lycian, 32/29
  171. U+102A0|66208|𐊠..U+102DF|66271|𐋟, Carian, 64/49
  172. U+102E0|66272|𐋠..U+102FF|66303|𐋿, Coptic-Epact-Numbers, 32/28
  173. U+10300|66304|𐌀..U+1032F|66351|𐌯, Old-Italic, 48/39
  174. U+10330|66352|𐌰..U+1034F|66383|𐍏, Gothic, 32/27
  175. U+10350|66384|𐍐..U+1037F|66431|𐍿, Old-Permic, 48/43
  176. U+10380|66432|𐎀..U+1039F|66463|𐎟, Ugaritic, 32/31
  177. U+103A0|66464|𐎠..U+103DF|66527|𐏟, Old-Persian, 64/50
  178. U+10400|66560|𐐀..U+1044F|66639|𐑏, Deseret, 80/80
  179. U+10450|66640|𐑐..U+1047F|66687|𐑿, Shavian, 48/48
  180. U+10480|66688|𐒀..U+104AF|66735|𐒯, Osmanya, 48/40
  181. U+104B0|66736|𐒰..U+104FF|66815|𐓿, Osage, 80/72
  182. U+10500|66816|𐔀..U+1052F|66863|𐔯, Elbasan, 48/40
  183. U+10530|66864|𐔰..U+1056F|66927|𐕯, Caucasian-Albanian, 64/53
  184. U+10600|67072|𐘀..U+1077F|67455|𐝿, Linear-A, 384/341
  185. U+10800|67584|𐠀..U+1083F|67647|𐠿, Cypriot-Syllabary, 64/55
  186. U+10840|67648|𐡀..U+1085F|67679|𐡟, Imperial-Aramaic, 32/31
  187. U+10860|67680|𐡠..U+1087F|67711|𐡿, Palmyrene, 32/32
  188. U+10880|67712|𐢀..U+108AF|67759|𐢯, Nabataean, 48/40
  189. U+108E0|67808|𐣠..U+108FF|67839|𐣿, Hatran, 32/26
  190. U+10900|67840|𐤀..U+1091F|67871|𐤟, Phoenician, 32/29
  191. U+10920|67872|𐤠..U+1093F|67903|𐤿, Lydian, 32/27
  192. U+10980|67968|𐦀..U+1099F|67999|𐦟, Meroitic-Hieroglyphs, 32/32
  193. U+109A0|68000|𐦠..U+109FF|68095|𐧿, Meroitic-Cursive, 96/90
  194. U+10A00|68096|𐨀..U+10A5F|68191|𐩟, Kharoshthi, 96/68
  195. U+10A60|68192|𐩠..U+10A7F|68223|𐩿, Old-South-Arabian, 32/32
  196. U+10A80|68224|𐪀..U+10A9F|68255|𐪟, Old-North-Arabian, 32/32
  197. U+10AC0|68288|𐫀..U+10AFF|68351|𐫿, Manichaean, 64/51
  198. U+10B00|68352|𐬀..U+10B3F|68415|𐬿, Avestan, 64/61
  199. U+10B40|68416|𐭀..U+10B5F|68447|𐭟, Inscriptional-Parthian, 32/30
  200. U+10B60|68448|𐭠..U+10B7F|68479|𐭿, Inscriptional-Pahlavi, 32/27
  201. U+10B80|68480|𐮀..U+10BAF|68527|𐮯, Psalter-Pahlavi, 48/29
  202. U+10C00|68608|𐰀..U+10C4F|68687|𐱏, Old-Turkic, 80/73
  203. U+10C80|68736|𐲀..U+10CFF|68863|𐳿, Old-Hungarian, 128/108
  204. U+10D00|68864|𐴀..U+10D3F|68927|𐴿, Hanifi-Rohingya, 64/50
  205. U+10E60|69216|𐹠..U+10E7F|69247|𐹿, Rumi-Numeral-Symbols, 32/31
  206. U+10F00|69376|𐼀..U+10F2F|69423|𐼯, Old-Sogdian, 48/40
  207. U+10F30|69424|𐼰..U+10F6F|69487|𐽯, Sogdian, 64/42
  208. U+11000|69632|𑀀..U+1107F|69759|𑁿, Brahmi, 128/109
  209. U+11080|69760|𑂀..U+110CF|69839|𑃏, Kaithi, 80/67
  210. U+110D0|69840|𑃐..U+110FF|69887|𑃿, Sora-Sompeng, 48/35
  211. U+11100|69888|𑄀..U+1114F|69967|𑅏, Chakma, 80/70
  212. U+11150|69968|𑅐..U+1117F|70015|𑅿, Mahajani, 48/39
  213. U+11180|70016|𑆀..U+111DF|70111|𑇟, Sharada, 96/94
  214. U+111E0|70112|𑇠..U+111FF|70143|𑇿, Sinhala-Archaic-Numbers, 32/20
  215. U+11200|70144|𑈀..U+1124F|70223|𑉏, Khojki, 80/62
  216. U+11280|70272|𑊀..U+112AF|70319|𑊯, Multani, 48/38
  217. U+112B0|70320|𑊰..U+112FF|70399|𑋿, Khudawadi, 80/69
  218. U+11300|70400|𑌀..U+1137F|70527|𑍿, Grantha, 128/86
  219. U+11400|70656|𑐀..U+1147F|70783|𑑿, Newa, 128/93
  220. U+11480|70784|𑒀..U+114DF|70879|𑓟, Tirhuta, 96/82
  221. U+11580|71040|𑖀..U+115FF|71167|𑗿, Siddham, 128/92
  222. U+11600|71168|𑘀..U+1165F|71263|𑙟, Modi, 96/79
  223. (U+11660|71264|𑙠..U+1167F|71295|𑙿, Mongolian-Supplement, 32/13)
  224. U+11680|71296|𑚀..U+116CF|71375|𑛏, Takri, 80/66
  225. U+11700|71424|𑜀..U+1173F|71487|𑜿, Ahom, 64/58
  226. U+11800|71680|𑠀..U+1184F|71759|𑡏, Dogra, 80/60
  227. U+118A0|71840|𑢠..U+118FF|71935|𑣿, Warang-Citi, 96/84
  228. U+11A00|72192|𑨀..U+11A4F|72271|𑩏, Zanabazar-Square, 80/72
  229. U+11A50|72272|𑩐..U+11AAF|72367|𑪯, Soyombo, 96/81
  230. U+11AC0|72384|𑫀..U+11AFF|72447|𑫿, Pau-Cin-Hau, 64/57
  231. U+11C00|72704|𑰀..U+11C6F|72815|𑱯, Bhaiksuki, 112/97
  232. U+11C70|72816|𑱰..U+11CBF|72895|𑲿, Marchen, 80/68
  233. U+11D00|72960|𑴀..U+11D5F|73055|𑵟, Masaram-Gondi, 96/75
  234. U+11D60|73056|𑵠..U+11DAF|73135|𑶯, Gunjala-Gondi, 80/63
  235. U+11EE0|73440|𑻠..U+11EFF|73471|𑻿, Makasar, 32/25
  236. U+12000|73728|𒀀..U+123FF|74751|𒏿, Cuneiform, 1024/922
  237. U+12400|74752|𒐀..U+1247F|74879|𒑿, Cuneiform-Numbers-and-Punctuation, 128/116
  238. U+12480|74880|𒒀..U+1254F|75087|𒕏, Early-Dynastic-Cuneiform, 208/196
  239. U+13000|77824|𓀀..U+1342F|78895|𓐯, Egyptian-Hieroglyphs, 1072/1071
  240. U+14400|82944|𔐀..U+1467F|83583|𔙿, Anatolian-Hieroglyphs, 640/583
  241. U+16800|92160|𖠀..U+16A3F|92735|𖨿, Bamum-Supplement, 576/569
  242. U+16A40|92736|𖩀..U+16A6F|92783|𖩯, Mro, 48/43
  243. U+16AD0|92880|𖫐..U+16AFF|92927|𖫿, Bassa-Vah, 48/36
  244. U+16B00|92928|𖬀..U+16B8F|93071|𖮏, Pahawh-Hmong, 144/127
  245. U+16E40|93760|𖹀..U+16E9F|93855|𖺟, Medefaidrin, 96/91
  246. U+16F00|93952|𖼀..U+16F9F|94111|𖾟, Miao, 160/133
  247. U+16FE0|94176|𖿠..U+16FFF|94207|𖿿, Ideographic-Symbols-and-Punctuation, 32/2
  248. (U+17000|94208|𗀀..U+187FF|100351|𘟿, Tangut, 6144/6130)
  249. U+18800|100352|𘠀..U+18AFF|101119|𘫿, Tangut-Components, 768/755
  250. U+1B000|110592|𛀀..U+1B0FF|110847|𛃿, Kana-Supplement, 256/256
  251. U+1B100|110848|𛄀..U+1B12F|110895|𛄯, Kana-Extended-A, 48/31
  252. U+1B170|110960|𛅰..U+1B2FF|111359|𛋿, Nushu, 400/396
  253. U+1BC00|113664|𛰀..U+1BC9F|113823|𛲟, Duployan, 160/143
  254. U+1BCA0|113824|𛲠..U+1BCAF|113839|𛲯, Shorthand-Format-Controls, 16/4
  255. U+1D000|118784|𝀀..U+1D0FF|119039|𝃿, Byzantine-Musical-Symbols, 256/246
  256. U+1D100|119040|𝄀..U+1D1FF|119295|𝇿, Musical-Symbols, 256/231
  257. U+1D200|119296|𝈀..U+1D24F|119375|𝉏, Ancient-Greek-Musical-Notation, 80/70
  258. U+1D2E0|119520|𝋠..U+1D2FF|119551|𝋿, Mayan-Numerals, 32/20
  259. U+1D300|119552|𝌀..U+1D35F|119647|𝍟, Tai-Xuan-Jing-Symbols, 96/87
  260. U+1D360|119648|𝍠..U+1D37F|119679|𝍿, Counting-Rod-Numerals, 32/25
  261. U+1D400|119808|𝐀..U+1D7FF|120831|𝟿, Mathematical-Alphanumeric-Symbols, 1024/996
  262. U+1D800|120832|𝠀..U+1DAAF|121519|𝪯, Sutton-SignWriting, 688/672
  263. U+1E000|122880|𞀀..U+1E02F|122927|𞀯, Glagolitic-Supplement, 48/38
  264. U+1E800|124928|𞠀..U+1E8DF|125151|𞣟, Mende-Kikakui, 224/213
  265. U+1E900|125184|𞤀..U+1E95F|125279|𞥟, Adlam, 96/87
  266. U+1EC70|126064|𞱰..U+1ECBF|126143|𞲿, Indic-Siyaq-Numbers, 80/68
  267. U+1EE00|126464|𞸀..U+1EEFF|126719|𞻿, Arabic-Mathematical-Alphabetic-Symbols, 256/143
  268. U+1F000|126976|🀀..U+1F02F|127023|🀯, Mahjong-Tiles, 48/44
  269. U+1F030|127024|🀰..U+1F09F|127135|🂟, Domino-Tiles, 112/100
  270. U+1F0A0|127136|🂠..U+1F0FF|127231|🃿, Playing-Cards, 96/82
  271. U+1F100|127232|🄀..U+1F1FF|127487|🇿, Enclosed-Alphanumeric-Supplement, 256/192
  272. U+1F200|127488|🈀..U+1F2FF|127743|🋿, Enclosed-Ideographic-Supplement, 256/64
  273. U+1F300|127744|🌀..U+1F5FF|128511|🗿, Miscellaneous-Symbols-and-Pictographs, 768/768
  274. U+1F600|128512|😀..U+1F64F|128591|🙏, Emoticons, 80/80
  275. U+1F650|128592|🙐..U+1F67F|128639|🙿, Ornamental-Dingbats, 48/48
  276. U+1F680|128640|🚀..U+1F6FF|128767|🛿, Transport-and-Map-Symbols, 128/108
  277. U+1F700|128768|🜀..U+1F77F|128895|🝿, Alchemical-Symbols, 128/116
  278. U+1F780|128896|🞀..U+1F7FF|129023|🟿, Geometric-Shapes-Extended, 128/89
  279. U+1F800|129024|🠀..U+1F8FF|129279|🣿, Supplemental-Arrows-C, 256/148
  280. U+1F900|129280|🤀..U+1F9FF|129535|🧿, Supplemental-Symbols-and-Pictographs, 256/213
  281. U+1FA00|129536|🨀..U+1FA6F|129647|🩯, Chess-Symbols, 112/14
  282. (U+20000|131072|𠀀..U+2A6DF|173791|𪛟, CJK-Unified-Ideographs-Extension-B, 42720/42711)
  283. (U+2A700|173824|𪜀..U+2B73F|177983|𫜿, CJK-Unified-Ideographs-Extension-C, 4160/4149)
  284. (U+2B740|177984|𫝀..U+2B81F|178207|𫠟, CJK-Unified-Ideographs-Extension-D, 224/222)
  285. (U+2B820|178208|𫠠..U+2CEAF|183983|𬺯, CJK-Unified-Ideographs-Extension-E, 5776/5762)
  286. (U+2CEB0|183984|𬺰..U+2EBEF|191471|𮯯, CJK-Unified-Ideographs-Extension-F, 7488/7473)
  287. U+2F800|194560|丽..U+2FA1F|195103|𯨟, CJK-Compatibility-Ideographs-Supplement, 544/542
  288. U+E0000|917504..U+E007F|917631, Tags, 128/97
  289. U+E0100|917760..U+E01EF|917999, Variation-Selectors-Supplement, 240/240
  290. (U+F0000|983040..U+FFFFF|1048575, Supplementary-Private-Use-Area-A, 65536/0)
  291. (U+100000|1048576..U+10FFFF|1114111, Supplementary-Private-Use-Area-B, 65536/0)
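
· the entries above follow one layout: first code-point (hexadecimal|decimal|glyph), last code-point, block-name, and code-points/chars. A Python sketch that parses one such entry and checks its internal consistency is given below. The regular expression is my reading of this layout (the glyph field and the surrounding parentheses are optional), and the names `BlockEntry` and `parse_block_line` are hypothetical.

```python
import re
from typing import NamedTuple

PATTERN = re.compile(
    r"U\+(?P<first>[0-9A-F]{4,6})\|(?P<first_dec>\d+)(?:\|.*?)?"
    r"\.\.U\+(?P<last>[0-9A-F]{4,6})\|(?P<last_dec>\d+)(?:\|[^,]*)?"
    r",\s*(?P<name>[\w-]+),\s*(?P<size>\d+)/(?P<chars>\d+)"
)


class BlockEntry(NamedTuple):
    first: int   # first code-point of the block
    last: int    # last code-point of the block
    name: str
    size: int    # number of code-points in the block
    chars: int   # number of assigned chars


def parse_block_line(line: str) -> BlockEntry:
    """Parse one entry of the listing and verify its numbers agree."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognised entry: {line!r}")
    first, last = int(match["first"], 16), int(match["last"], 16)
    entry = BlockEntry(first, last, match["name"],
                       int(match["size"]), int(match["chars"]))
    if first != int(match["first_dec"]) or last != int(match["last_dec"]):
        raise ValueError("hexadecimal and decimal values disagree")
    if last - first + 1 != entry.size:
        raise ValueError("block size does not match its range")
    if entry.chars > entry.size:
        raise ValueError("more chars than code-points")
    return entry


if __name__ == "__main__":
    print(parse_block_line(
        "  1. U+0000|0..U+007F|127, Basic-Latin, 128/128"))
```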

code-point.USED

description::
· used-code-point of Unicode is a-code-point that has an assigned char or is reserved for surrogates, internal-use or private-use.
[hmnSgm.2018-06-26]

name::
* cpt.char.Unicode'code-point.used,
* cpt.Unicode'code-point.used,
* cpt.Unicode'char'code-point.used,
* cpt.used-code-point--of-Unicode,

code-point.ASSIGNED

description::
· assigned--Unicode-code-point is a-used-code-point with an assigned char.
[hmnSgm.2018-06-26]

name::
* cpt.assigned--Unicode-code-point,
* cpt.char.Unicode'code-point.assigned,
* cpt.Unicode'code-point.assigned,

code-point.PRIVATE-USE

description::
Private-use characters are code points whose interpretation is not specified by a character encoding standard and whose use and interpretation may be determined by private agreement among cooperating users. Private-use characters are sometimes also referred to as user-defined characters (UDC) or vendor-defined characters (VDC).
[http://www.unicode.org/faq/private_use.html#pua1]

name::
* cpt.char.Unicode'code-point.private-use-character,
* cpt.private-use-character--Unicode-code-point,
* cpt.Unicode'code-point.private-use-character,
* cpt.Unicode'code-point.user-defined-character,
* cpt.Unicode'code-point.vendor-defined-character,

code-point.INTERNAL-USE

description::
A "noncharacter" is a code point that is permanently reserved in the Unicode Standard for internal use.
[http://www.unicode.org/faq/private_use.html#noncharacters]
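
· a minimal Python test for noncharacters. The ranges are not stated in this document; they come from the Unicode Standard: U+FDD0..U+FDEF, plus the last two code points of each of the 17 planes (U+FFFE, U+FFFF, U+1FFFE, ... U+10FFFF), 66 code points in total. The function name is hypothetical.

```python
def is_noncharacter(code_point: int) -> bool:
    """Return True for the 66 code points reserved for internal use."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    in_fdd0_range = 0xFDD0 <= code_point <= 0xFDEF
    # The low 16 bits are FFFE or FFFF: the last two of any plane.
    last_two_of_plane = (code_point & 0xFFFE) == 0xFFFE
    return in_fdd0_range or last_two_of_plane


if __name__ == "__main__":
    total = sum(is_noncharacter(cp) for cp in range(0x110000))
    print("noncharacters:", total)
```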

name::
* cpt.char.Unicode'code-point.internal-use,
* cpt.noncharacter--Unicode-code-point,
* cpt.Unicode'code-point.internal-use,
* cpt.Unicode'code-point.noncharacter,

code-point.USED.NO

description::
· usedNo-code-point of Unicode is a-code-point that has NO assigned char and is NOT reserved for surrogates, internal-use or private-use.
[hmnSgm.2018-06-26]
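
· the used / usedNo distinction of this document can be sketched in Python with the standard `unicodedata` module. The labels returned are the names used in this document, not official Unicode terms, and the function names are hypothetical. The assigned / usedNo answer depends on the Unicode version of the interpreter (`unicodedata.unidata_version`); the noncharacter ranges are taken from the Unicode Standard.

```python
import unicodedata


def classify_code_point(code_point: int) -> str:
    """Label a code-point: assigned, surrogate, private-use,
    internal-use, or usedNo."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code-space")
    # Noncharacters have category Cn too, so test them first.
    if 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE:
        return "internal-use"
    category = unicodedata.category(chr(code_point))
    if category == "Cs":
        return "surrogate"
    if category == "Co":
        return "private-use"
    if category == "Cn":
        return "usedNo"
    return "assigned"


def is_used(code_point: int) -> bool:
    """A used-code-point is anything that is not usedNo."""
    return classify_code_point(code_point) != "usedNo"


if __name__ == "__main__":
    for cp in (0x41, 0xD800, 0xE000, 0xFFFE, 0x50000):
        print(f"U+{cp:04X}", classify_code_point(cp))
```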

name::
* cpt.char.Unicode'code-point.usedNo,
* cpt.Unicode'code-point.usedNo,
* cpt.Unicode'char'code-point.usedNo,
* cpt.usedNo-code-point--of-Unicode,

code-point.UNKNOWN-SCRIPT

description::
The value of "unknown" script (ISO 15924 code Zzzz) is given to unassigned, private use, noncharacter, and surrogate code points.
[https://en.wikipedia.org/wiki/Script_(Unicode)#Special_script_property_values {2018-06-26}]

name::
* cpt.unknown-script-code-point--of-Unicode,
* cpt.unknown-script--Unicode-code-point,
* cpt.char.Unicode'code-point.unknown-script,
* cpt.Unicode'code-point.unknown-script,

code-point.UNKNOWN-SCRIPT.NO

description::
· known-script--code-points are Unicode-code-points with assigned characters[a] which[a] are part of a-script.
[hmnSgm.2018-06-26]

name::
* cpt.known-script-code-point--of-Unicode,
* cpt.unknown-scriptNo-code-point--of-Unicode,
* cpt.unknown-scriptNo--Unicode-code-point,
* cpt.char.Unicode'code-point.unknown-scriptNo,
* cpt.Unicode'code-point.unknown-scriptNo,

block of Unicode-char (link)

plane of Unicode-char (link)

Age-property of Unicode-char

description::
The Age property indicates the first version in which a particular Unicode character was assigned.
For example, U+20AC € EURO SIGN was added to Version 2.1 of the Unicode Standard, so it has age=2.1, while U+20B9 ₹ INDIAN RUPEE SIGN was added to Version 6.0 of the Unicode Standard, so it has age=6.0.
[https://www.unicode.org/reports/tr44/tr44-22.html#Character_Age]
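
· Python's standard library does not expose the Age property, so a lookup needs the data file itself. The sketch below parses lines in the `range ; value # comment` layout used by the Unicode data files. The sample lines are hand-written for illustration, not copied from DerivedAge.txt, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Illustrative sample in the assumed "range ; value # comment" layout.
SAMPLE = """\
# comment lines start with '#'
0041..005A    ; 1.1 #  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
20AC          ; 2.1 #       EURO SIGN
20B9          ; 6.0 #       INDIAN RUPEE SIGN
"""


def parse_ucd_lines(text: str) -> List[Tuple[int, int, str]]:
    """Return (first, last, value) for every data line."""
    entries = []
    for raw in text.splitlines():
        data = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        range_part, value = (field.strip() for field in data.split(";", 1))
        first_hex, _, last_hex = range_part.partition("..")
        first = int(first_hex, 16)
        last = int(last_hex, 16) if last_hex else first
        entries.append((first, last, value))
    return entries


def age_of(code_point: int, entries) -> Optional[str]:
    """Return the Age value, or None if the code point is not listed."""
    for first, last, value in entries:
        if first <= code_point <= last:
            return value
    return None


if __name__ == "__main__":
    table = parse_ucd_lines(SAMPLE)
    print(age_of(0x20AC, table), age_of(0x20B9, table))
```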

name::
* cpt.char.Unicode'Age-property,
* cpt.Unicode'Age-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/DerivedAge.txt,
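
· a sketch of reading Age values from text in the DerivedAge.txt line format ("code-point-or-range ; value # comment"). SAMPLE is a hand-written illustrative excerpt, not the real file, and the function names are hypothetical.

```python
SAMPLE = """\
# Illustrative excerpt in the DerivedAge.txt line format (not the full file).
0020..007E    ; 1.1 #  [95] SPACE..TILDE
20AC          ; 2.1 #       EURO SIGN
20B9          ; 6.0 #       INDIAN RUPEE SIGN
"""


def parse_ucd_lines(text):
    """Yield (first, last, value) for each data line of a UCD-style file."""
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue  # blank or comment-only line
        cp_field, value = (f.strip() for f in data.split(";")[:2])
        first, _, last = cp_field.partition("..")  # single code point or range
        yield int(first, 16), int(last or first, 16), value


def age_of(cp, table):
    """Return the Age value for cp, or None if the table does not cover it."""
    for first, last, value in table:
        if first <= cp <= last:
            return value
    return None


table = list(parse_ucd_lines(SAMPLE))
print(age_of(0x20AC, table), age_of(0x20B9, table))
```

· the same parser shape should apply to the other "field ; field # comment" files of the UCD listed in the addressWpg entries, though each file defines its own fields.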

Alphabetic-property of Unicode-char

description::
Derived Property: Alphabetic
Generated from: Uppercase + Lowercase + Lt + Lm + Lo + Nl + Other_Alphabetic
[ftp://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt]

name::
* cpt.char.Unicode'Alphabetic-property,
* cpt.Unicode'Alphabetic-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt,

Bidi_Class-property of Unicode-char

description::
· the Bidi_Class property is used to determine the directionality of characters in bidirectional Unicode text.
[https://www.unicode.org/reports/tr9/tr9-39.html]

name::
* cpt.char.Unicode'Bidi_Class-property,
* cpt.Unicode'Bidi_Class-property,

addressWpg::
* Unicode® Standard Annex #9, UNICODE BIDIRECTIONAL ALGORITHM, http://www.unicode.org/reports/tr9/,
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedBidiClass.txt,

specific::
* L-Left_To_Right,
* R-Right_To_Left,
* EN-European_Number,
* ES-European_Separator,
* ET-European_Terminator,
* AN-Arabic_Number,
* CS-Common_Separator,
* B-Paragraph_Separator,
* S-Segment_Separator,
* WS-White_Space,
* ON-Other_Neutral,
* BN-Boundary_Neutral,
* NSM-Nonspacing_Mark,
* AL-Arabic_Letter,
* LRO-Left_To_Right_Override,
* RLO-Right_To_Left_Override,
* LRE-Left_To_Right_Embedding,
* RLE-Right_To_Left_Embedding,
* PDF-Pop_Directional_Format,
* LRI-Left_To_Right_Isolate,
* RLI-Right_To_Left_Isolate,
* FSI-First_Strong_Isolate,
* PDI-Pop_Directional_Isolate,
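
· a short sketch that looks up Bidi_Class values with Python's standard unicodedata module. The sample chars are my own choice; the values depend on the Unicode version bundled with the interpreter, though these are long-standing ones.

```python
import unicodedata

# char -> expected Bidi_Class abbreviation
samples = {
    "A": "L",         # LATIN CAPITAL LETTER A
    "\u05D0": "R",    # HEBREW LETTER ALEF
    "\u0627": "AL",   # ARABIC LETTER ALEF
    "7": "EN",        # DIGIT SEVEN
    "\u0667": "AN",   # ARABIC-INDIC DIGIT SEVEN
    " ": "WS",        # SPACE
    "\u202E": "RLO",  # RIGHT-TO-LEFT OVERRIDE
}

bidi_classes = {ch: unicodedata.bidirectional(ch) for ch in samples}
for ch, value in bidi_classes.items():
    print(f"U+{ord(ch):04X} {value}")
```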

Bidi_Mirrored-property of Unicode-char

description::
If the character is a "mirrored" character in bidirectional text, this field has the value "Y"; otherwise "N". See Section 4.7, Bidi Mirrored of [Unicode].
Do not confuse this with the Bidi_Mirroring_Glyph property.
[https://www.unicode.org/reports/tr44/#Bidi_Mirrored]

name::
* cpt.char.Unicode'Bidi_Mirrored-property,
* cpt.Unicode'Bidi_Mirrored-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedBinaryProperties.txt,
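
· a sketch of querying Bidi_Mirrored with Python's unicodedata.mirrored(), which returns 1 for "Y" and 0 for "N". It says nothing about WHICH char supplies the mirrored glyph; that is the separate Bidi_Mirroring_Glyph property.

```python
import unicodedata


def is_bidi_mirrored(ch: str) -> bool:
    """True if the character has Bidi_Mirrored=Y."""
    return unicodedata.mirrored(ch) == 1


for ch in "(<a":
    print(repr(ch), is_bidi_mirrored(ch))
```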

Bidi_Mirroring_Glyph-property of Unicode-char

description::
Informative mapping for substituting characters in an implementation of bidirectional mirroring. This maps a subset of characters with the Bidi_Mirrored property to other characters that normally are displayed with the corresponding mirrored glyph. When a character with the Bidi_Mirrored property has the default value for Bidi_Mirroring_Glyph, that means that no other character exists whose glyph is appropriate for character-based glyph mirroring. Implementations must then use other mechanisms to implement mirroring of those characters for the Unicode Bidirectional Algorithm. See Unicode Standard Annex #9, "Unicode Bidirectional Algorithm" [UAX9]. Do not confuse this property with the Bidi_Mirrored property itself.
[https://www.unicode.org/reports/tr44/tr44-22.html#Bidi_Mirroring_Glyph]

name::
* cpt.char.Unicode'Bidi_Mirroring_Glyph-property,
* cpt.Unicode'Bidi_Mirroring_Glyph-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/BidiMirroring.txt,

Case-property of Unicode-char

description::
Case is a normative property of characters in certain alphabets whereby characters are considered to be variants of a single letter. These variants, which may differ markedly in shape and size, are called the uppercase letter (also known as capital or majuscule) and the lowercase letter (also known as small or minuscule). The uppercase letter is generally larger than the lowercase letter.
Because of the inclusion of certain composite characters for compatibility, such as U+01F1 latin capital letter dz, a third case, called titlecase, is used where the first character of a word must be capitalized. An example of such a character is U+01F2 latin capital letter d with small letter z. The three case forms are UPPERCASE, Titlecase, and lowercase.
For those scripts that have case (Latin, Greek, Coptic, Cyrillic, Glagolitic, Armenian, archaic Georgian, Deseret, and Warang Citi), uppercase characters typically contain the word capital in their names. Lowercase characters typically contain the word small. However, this is not a reliable guide. The word small in the names of characters from scripts other than those just listed has nothing to do with case. There are other exceptions as well, such as small capital letters that are not formally uppercase. Some Greek characters with capital in their names are actually titlecase. (Note that while the archaic Georgian script contained upper- and lowercase pairs, they are not used in modern Georgian. See Section 7.7, Georgian.)
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf, 4.2]

name::
* cpt.char.Unicode'Case-property,
* cpt.Unicode'Case-property,

specific::
* Lowercase,
* Titlecase,
* Uppercase,

Lowercase-property of Unicode-char

description::
Derived Property: Lowercase
Generated from: Ll + Other_Lowercase
[ftp://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt]

name::
* cpt.char.Unicode'Lowercase-property,
* cpt.Unicode'Lowercase-property,

Uppercase-property of Unicode-char

description::
Derived Property: Uppercase
Generated from: Lu + Other_Uppercase
[ftp://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt]

name::
* cpt.char.Unicode'Uppercase-property,
* cpt.Unicode'Uppercase-property,

Titlecase-property of Unicode-char

description::
Because of the inclusion of certain composite characters for compatibility, such as U+01F1 latin capital letter dz, a third case, called titlecase, is used where the first character of a word must be capitalized. An example of such a character is U+01F2 latin capital letter d with small letter z. The three case forms are UPPERCASE, Titlecase, and lowercase.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf, 4.2]

name::
* cpt.char.Unicode'Titlecase-property,
* cpt.Unicode'Titlecase-property,
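
· a sketch of the three case forms of the DZ digraph mentioned above, using Python's str case methods and unicodedata.category(). The constant names are mine.

```python
import unicodedata

DZ_UPPER = "\u01F1"  # LATIN CAPITAL LETTER DZ
DZ_TITLE = "\u01F2"  # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
DZ_LOWER = "\u01F3"  # LATIN SMALL LETTER DZ

# General_Category distinguishes the three forms: Lu, Lt, Ll.
dz_categories = [unicodedata.category(ch) for ch in (DZ_UPPER, DZ_TITLE, DZ_LOWER)]
print(dz_categories)

# The case mappings move between the three forms.
assert DZ_LOWER.upper() == DZ_UPPER
assert DZ_LOWER.title() == DZ_TITLE
assert DZ_UPPER.lower() == DZ_LOWER
```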

Cased-property of Unicode-char

description::
D135 A character C is defined to be cased if and only if C has the Lowercase or Uppercase property or has a General_Category value of Titlecase_Letter.
• The Uppercase and Lowercase property values are specified in the data file DerivedCoreProperties.txt in the Unicode Character Database. The derived property Cased is also listed in DerivedCoreProperties.txt.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf]

name::
* cpt.char.Unicode'Cased-property,
* cpt.Unicode'Cased-property,
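
· a rough sketch of definition D135, under a loud assumption: it tests only the General_Category values Lu, Ll and Lt, and so IGNORES the contributory properties Other_Lowercase and Other_Uppercase, which the standard library does not expose directly. It is therefore an approximation of Cased, not the derived property itself. The function name is hypothetical.

```python
import unicodedata


def is_cased_approx(ch: str) -> bool:
    """Approximate Cased using General_Category only (Lu, Ll, Lt).

    Assumption: chars that are cased only through Other_Lowercase or
    Other_Uppercase (for example U+2170 SMALL ROMAN NUMERAL ONE, category Nl)
    are missed by this approximation.
    """
    return unicodedata.category(ch) in {"Lu", "Ll", "Lt"}


for ch in ("A", "a", "\u01F2", "1", "\u4E2D"):
    print(f"U+{ord(ch):04X}", is_cased_approx(ch))
```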

Case_Folding-property of Unicode-char

description::
Mapping from characters to their case-folded forms. This is an informative file containing normative derived properties.
Derived from UnicodeData and SpecialCasing.
Note: The case foldings are omitted in the data file if they are the same as the code point itself.
[https://www.unicode.org/reports/tr44/tr44-22.html#CaseFolding.txt]

name::
* cpt.char.Unicode'Case_Folding-property,
* cpt.Unicode'Case_Folding-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/CaseFolding.txt,

specific::
* C: common case folding, common mappings shared by both simple and full mappings.
* F: full case folding, mappings that cause strings to grow in length. Multiple characters are separated by spaces.
* S: simple case folding, mappings to single characters where different from F.
* T: special case for uppercase I and dotted uppercase I
- For non-Turkic languages, this mapping is normally not used.
- For Turkic languages (tr, az), this mapping can be used instead of the normal mapping for these characters. Note that the Turkic mappings do not maintain canonical equivalence without additional processing. See the discussions of case mapping in the Unicode Standard for more information.
[ftp://www.unicode.org/Public/UCD/latest/ucd/CaseFolding.txt]
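
· a sketch of caseless comparison with Python's str.casefold(), which to my understanding applies the full foldings (statuses C and F) and not the Turkic T mappings. The helper name is hypothetical, and normalization is deliberately left out.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after case folding (no normalization applied)."""
    return a.casefold() == b.casefold()


# A full (F) folding can grow the string: U+00DF folds to "ss".
folded = "Straße".casefold()
print(folded, len("Straße"), len(folded))

# lower() is a case mapping, not a folding, and leaves U+00DF alone.
lowered = "Straße".lower()
print(lowered)
```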

Bidi_Paired_Bracket-property of Unicode-char

description::
For an opening bracket, the code point of the matching closing bracket. For a closing bracket, the code point of the matching opening bracket. This property is used in the implementation of parenthesis matching. See Unicode Standard Annex #9, "Unicode Bidirectional Algorithm" [UAX9].
[https://www.unicode.org/reports/tr44/tr44-22.html#Bidi_Paired_Bracket]

name::
* cpt.char.Unicode'Bidi_Paired_Bracket-property,
* cpt.Unicode'Bidi_Paired_Bracket-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/BidiBrackets.txt,
* https://www.unicode.org/reports/tr9/,

Bidi_Paired_Bracket_Type-property of Unicode-char

description::
Type of a paired bracket, either opening or closing. This property is used in the implementation of parenthesis matching. See Unicode Standard Annex #9, "Unicode Bidirectional Algorithm" [UAX9].
[https://www.unicode.org/reports/tr44/tr44-22.html#Bidi_Paired_Bracket_Type]

name::
* cpt.char.Unicode'Bidi_Paired_Bracket_Type-property,
* cpt.Unicode'Bidi_Paired_Bracket_Type-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/BidiBrackets.txt,
* https://www.unicode.org/reports/tr9/,

specific::
* o-Open,
* c-Close,
* n-None,

Decomposition_Mapping-property of Unicode-char

description::
This is a string property, consisting of a sequence of one or more Unicode code points.
... The prefixed tags supplied with a subset of the decomposition mappings generally indicate formatting information.
[https://www.unicode.org/reports/tr44/#Character_Decomposition_Mappings]

name::
* cpt.char.Unicode'Decomposition_Mapping-property,
* cpt.Unicode'Decomposition_Mapping-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedDecompositionType.txt,

specific::
* <font>: Font variant (for example, a blackletter form)
* <noBreak>: No-break version of a space or hyphen
* <initial>: Initial presentation form (Arabic)
* <medial>: Medial presentation form (Arabic)
* <final>: Final presentation form (Arabic)
* <isolated>: Isolated presentation form (Arabic)
* <circle>: Encircled form
* <super>: Superscript form
* <sub>: Subscript form
* <vertical>: Vertical layout presentation form
* <wide>: Wide (or zenkaku) compatibility character
* <narrow>: Narrow (or hankaku) compatibility character
* <small>: Small variant form (CNS compatibility)
* <square>: CJK squared font variant
* <fraction>: Vulgar fraction form
* <compat>: Otherwise unspecified compatibility character
[https://www.unicode.org/reports/tr44/#Formatting_Tags_Table]
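
· a sketch that splits the string returned by Python's unicodedata.decomposition() into the optional formatting tag and the code-points of the mapping. The helper name is hypothetical.

```python
import unicodedata


def split_decomposition(ch: str):
    """Return (tag, code_points); tag is None for a canonical mapping.

    An empty code_points list means the character has no decomposition mapping.
    """
    fields = unicodedata.decomposition(ch).split()
    tag = None
    if fields and fields[0].startswith("<"):
        tag = fields.pop(0)  # for example "<fraction>" or "<noBreak>"
    return tag, [int(f, 16) for f in fields]


for ch in ("\u00BD", "\u00E9", "\u00A0", "\u2460", "A"):
    tag, cps = split_decomposition(ch)
    print(f"U+{ord(ch):04X}", tag, [f"{cp:04X}" for cp in cps])
```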

East_Asian_Width-property of Unicode-char

description::
When dealing with East Asian text, there is the concept of an inherent width of a character. This width takes on either of two values: narrow or wide. For traditional mixed-width East Asian legacy character sets, this classification into narrow and wide corresponds with few exceptions directly to the storage size for each character: a few narrow characters use a single byte per character and all other characters (usually wide) use two or more bytes.
[https://www.unicode.org/reports/tr11/tr11-35.html#Overview]

name::
* cpt.char.Unicode'East_Asian_Width-property,
* cpt.Unicode'East_Asian_Width-property,

addressWpg::
* https://www.unicode.org/reports/tr11/,
* ftp://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt,
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedEastAsianWidth.txt,

specific::
* A-Ambiguous: can be sometimes wide and sometimes narrow,
* F-Fullwidth,
* H-Halfwidth,
* N-Neutral (= Not East Asian),
* Na-Narrow,
* W-Wide: always wide,
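
· a sketch of using East_Asian_Width to estimate display columns, via Python's unicodedata.east_asian_width(). Counting Ambiguous (A) as narrow is an assumption that fits a non-East-Asian context; combining marks and control chars are not handled, so this is only an estimate.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Estimate terminal columns: 2 for Wide/Fullwidth chars, 1 otherwise."""
    # Assumption: Ambiguous (A) chars are counted as narrow here.
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


width_samples = {
    "A": "Na",       # LATIN CAPITAL LETTER A
    "\uFF21": "F",   # FULLWIDTH LATIN CAPITAL LETTER A
    "\u4E2D": "W",   # CJK ideograph
    "\uFF71": "H",   # HALFWIDTH KATAKANA LETTER A
    "\u00A1": "A",   # INVERTED EXCLAMATION MARK
    "\u05D0": "N",   # HEBREW LETTER ALEF
}
widths = {ch: unicodedata.east_asian_width(ch) for ch in width_samples}
print(widths)
```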

General_Category-property of Unicode-char

description::
The General_Category property of a code point provides for the most general classification of that code point.
It is usually determined based on the primary characteristic of the assigned character for that code point. For example, is the character a letter, a mark, a number, punctuation, or a symbol, and if so, of what type?
Other General_Category values define the classification of code points which are not assigned to regular graphic characters, including such statuses as private-use, control, surrogate code point, and reserved unassigned.
[http://www.unicode.org/reports/tr44/#General_Category_Values]

name::
* cpt.char.Unicode'General_Category-property,
* cpt.Unicode'General_Category-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedGeneralCategory.txt,

specific::
[http://www.unicode.org/reports/tr44/#GC_Values_Table]

Abbr-Long: Description
* Lu-Uppercase_Letter: an uppercase letter,
* Ll-Lowercase_Letter: a lowercase letter,
* Lt-Titlecase_Letter: a digraphic character, with first part uppercase,
* LC-Cased_Letter: Lu | Ll | Lt,
* Lm-Modifier_Letter: a modifier letter,
* Lo-Other_Letter: other letters, including syllables and ideographs,
* L-Letter: Lu | Ll | Lt | Lm | Lo,
* Mn-Nonspacing_Mark: a nonspacing combining mark (zero advance width),
* Mc-Spacing_Mark: a spacing combining mark (positive advance width),
* Me-Enclosing_Mark: an enclosing combining mark,
* M-Mark: Mn | Mc | Me,
* Nd-Decimal_Number: a decimal digit,
* Nl-Letter_Number: a letterlike numeric character,
* No-Other_Number: a numeric character of other type,
* N-Number: Nd | Nl | No,
* Pc-Connector_Punctuation: a connecting punctuation mark, like a tie,
* Pd-Dash_Punctuation: a dash or hyphen punctuation mark,
* Ps-Open_Punctuation: an opening punctuation mark (of a pair),
* Pe-Close_Punctuation: a closing punctuation mark (of a pair),
* Pi-Initial_Punctuation: an initial quotation mark,
* Pf-Final_Punctuation: a final quotation mark,
* Po-Other_Punctuation: a punctuation mark of other type,
* P-Punctuation: Pc | Pd | Ps | Pe | Pi | Pf | Po,
* Sm-Math_Symbol: a symbol of mathematical use,
* Sc-Currency_Symbol: a currency sign,
* Sk-Modifier_Symbol: a non-letterlike modifier symbol,
* So-Other_Symbol: a symbol of other type,
* S-Symbol: Sm | Sc | Sk | So,
* Zs-Space_Separator: a space character (of various non-zero widths),
* Zl-Line_Separator: U+2028 LINE SEPARATOR only,
* Zp-Paragraph_Separator: U+2029 PARAGRAPH SEPARATOR only,
* Z-Separator: Zs | Zl | Zp,
* Cc-Control: a C0 or C1 control code,
* Cf-Format: a format control character,
* Cs-Surrogate: a surrogate code point,
* Co-Private_Use: a private-use character,
* Cn-Unassigned: a reserved unassigned code point or a noncharacter,
* C-Other: Cc | Cf | Cs | Co | Cn,
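
· a sketch of reading General_Category with Python's unicodedata.category(), which returns the two-letter abbreviation; the first letter gives the grouped value (L, M, N, P, S, Z, C). The helper name is hypothetical.

```python
import unicodedata


def major_class(ch: str) -> str:
    """Return the grouped General_Category value: L, M, N, P, S, Z or C."""
    return unicodedata.category(ch)[0]


gc_samples = {
    "A": "Lu",
    "a": "Ll",
    "\u01F2": "Lt",  # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
    "\u0301": "Mn",  # COMBINING ACUTE ACCENT
    "5": "Nd",
    "_": "Pc",
    "\u20AC": "Sc",  # EURO SIGN
    " ": "Zs",
    "\u2028": "Zl",  # LINE SEPARATOR
    "\n": "Cc",
    "\uE000": "Co",  # first private-use code point of the BMP
    "\uFFFF": "Cn",  # a noncharacter
}
categories = {ch: unicodedata.category(ch) for ch in gc_samples}
for ch, gc in categories.items():
    print(f"U+{ord(ch):04X} {gc} {major_class(ch)}")
```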

Joining_Group-property of Unicode-char

description::
The Arabic characters with the property values Joining_Type=Dual_Joining and Joining_Type=Right_Joining can each be subdivided into shaping groups, based on the behavior of their letter skeletons when shaped in context.
The Unicode character property that specifies these groups is called Joining_Group.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf]
===
All code points not explicitly listed for Joining_Group have the value No_Joining_Group.
[ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedJoiningGroup.txt]

name::
* cpt.char.Unicode'Joining_Group-property,
* cpt.Unicode'Joining_Group-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedJoiningGroup.txt,

specific::
* African_Feh,
* African_Noon,
* African_Qaf,
* Ain,
* Alaph,
* Alef,
* Beh,
* Beth,
* Burushaski_Yeh_Barree,
* Dal,
* Dalath_Rish,
* E,
* Farsi_Yeh,
* Fe,
* Feh,
* Final_Semkath,
* Gaf,
* Gamal,
* Hah,
* Hanifi_Rohingya_Kinna_Ya,
* Hanifi_Rohingya_Pa,
* He,
* Heh,
* Heh_Goal,
* Heth,
* Kaf,
* Kaph,
* Khaph,
* Knotted_Heh,
* Lam,
* Lamadh,
* Malayalam_Bha,
* Malayalam_Ja,
* Malayalam_Lla,
* Malayalam_Llla,
* Malayalam_Nga,
* Malayalam_Nna,
* Malayalam_Nnna,
* Malayalam_Nya,
* Malayalam_Ra,
* Malayalam_Ssa,
* Malayalam_Tta,
* Manichaean_Aleph,
* Manichaean_Ayin,
* Manichaean_Beth,
* Manichaean_Daleth,
* Manichaean_Dhamedh,
* Manichaean_Five,
* Manichaean_Gimel,
* Manichaean_Heth,
* Manichaean_Hundred,
* Manichaean_Kaph,
* Manichaean_Lamedh,
* Manichaean_Mem,
* Manichaean_Nun,
* Manichaean_One,
* Manichaean_Pe,
* Manichaean_Qoph,
* Manichaean_Resh,
* Manichaean_Sadhe,
* Manichaean_Samekh,
* Manichaean_Taw,
* Manichaean_Ten,
* Manichaean_Teth,
* Manichaean_Thamedh,
* Manichaean_Twenty,
* Manichaean_Waw,
* Manichaean_Yodh,
* Manichaean_Zayin,
* Meem,
* Mim,
* Noon,
* Nun,
* Nya,
* Pe,
* Qaf,
* Qaph,
* Reh,
* Reversed_Pe,
* Rohingya_Yeh,
* Sad,
* Sadhe,
* Seen,
* Semkath,
* Shin,
* Straight_Waw,
* Swash_Kaf,
* Syriac_Waw,
* Tah,
* Taw,
* Teh_Marbuta,
* Teh_Marbuta_Goal,
* Teth,
* Waw,
* Yeh,
* Yeh_Barree,
* Yeh_With_Tail,
* Yudh,
* Yudh_He,
* Zain,
* Zhain,

Joining_Type-property of Unicode-char

description::
Each Arabic letter must be depicted by one of a number of possible contextual glyph forms.
The appropriate form is determined on the basis of the cursive joining behavior of that character as it interacts with the cursive joining behavior of adjacent characters.
In the Unicode Standard, such cursive joining behavior is formally described in terms of values of a character property called Joining_Type.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf]

All code points not explicitly listed for Joining_Type have the value Non_Joining (U).
[ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedJoiningType.txt]

name::
* cpt.char.Unicode'Joining_Type-property,
* cpt.Unicode'Joining_Type-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedJoiningType.txt,

specific::
* R-Right_Joining,
* L-Left_Joining,
* D-Dual_Joining,
* C-Join_Causing,
* U-Non_Joining,
* T-Transparent,

Line_Break-property of Unicode-char

description::
Line breaking, also known as word wrapping, is the process of breaking a section of text into lines such that it will fit in the available width of a page, window or other display area.
The Unicode Line Breaking Algorithm performs part of this process.
Given an input text, it produces a set of positions called "break opportunities" that are appropriate points to begin a new line.
The selection of actual line break positions from the set of break opportunities is not covered by the Unicode Line Breaking Algorithm, but is in the domain of higher level software with knowledge of the available width and the display size of the text.
[https://www.unicode.org/reports/tr14/tr14-41.html#Scope]

name::
* cpt.char.Unicode'Line_Break-property,
* cpt.Unicode'Line_Break-property,

addressWpg::
* https://www.unicode.org/reports/tr14/,
* ftp://www.unicode.org/Public/UCD/latest/ucd/LineBreak.txt,
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedLineBreak.txt,

specific::
Class | Descriptive Name | Examples | Behavior
Non-tailorable Line Breaking Classes
BK | Mandatory Break | NL, PARAGRAPH SEPARATOR | Cause a line break (after)
CR | Carriage Return | CR | Cause a line break (after), except between CR and LF
LF | Line Feed | LF | Cause a line break (after)
CM | Combining Mark | Combining marks, control codes | Prohibit a line break between the character and the preceding character
NL | Next Line | NEL | Cause a line break (after)
SG | Surrogate | Surrogates | Do not occur in well-formed text
WJ | Word Joiner | WJ | Prohibit line breaks before and after
ZW | Zero Width Space | ZWSP | Provide a break opportunity
GL | Non-breaking (“Glue”) | CGJ, NBSP, ZWNBSP | Prohibit line breaks before and after
SP | Space | SPACE | Enable indirect line breaks
ZWJ | Zero Width Joiner | Zero Width Joiner | Prohibit line breaks within joiner sequences
Break Opportunities
B2 | Break Opportunity Before and After | Em dash | Provide a line break opportunity before and after the character
BA | Break After | Spaces, hyphens | Generally provide a line break opportunity after the character
BB | Break Before | Punctuation used in dictionaries | Generally provide a line break opportunity before the character
HY | Hyphen | HYPHEN-MINUS | Provide a line break opportunity after the character, except in numeric context
CB | Contingent Break Opportunity | Inline objects | Provide a line break opportunity contingent on additional information
Characters Prohibiting Certain Breaks
CL | Close Punctuation | “}”, “❳”, “⟫” etc. | Prohibit line breaks before
CP | Close Parenthesis | “)”, “]” | Prohibit line breaks before
EX | Exclamation/Interrogation | “!”, “?”, etc. | Prohibit line breaks before
IN | Inseparable | Leaders | Allow only indirect line breaks between pairs
NS | Nonstarter | “‼”, “‽”, “⁇”, “⁉”, etc. | Allow only indirect line breaks before
OP | Open Punctuation | “(”, “[”, “{”, etc. | Prohibit line breaks after
QU | Quotation | Quotation marks | Act like they are both opening and closing
Numeric Context
IS | Infix Numeric Separator | . , | Prevent breaks after any and before numeric
NU | Numeric | Digits | Form numeric expressions for line breaking purposes
PO | Postfix Numeric | %, ¢ | Do not break following a numeric expression
PR | Prefix Numeric | $, £, ¥, etc. | Do not break in front of a numeric expression
SY | Symbols Allowing Break After | / | Prevent a break before, and allow a break after
Other Characters
AI | Ambiguous (Alphabetic or Ideographic) | Characters with Ambiguous East Asian Width | Act like AL when the resolved EAW is N; otherwise, act as ID
AL | Alphabetic | Alphabets and regular symbols | Are alphabetic characters or symbols that are used with alphabetic characters
CJ | Conditional Japanese Starter | Small kana | Treat as NS or ID for strict or normal breaking.
EB | Emoji Base | All emoji allowing modifiers | Do not break from following Emoji Modifier
EM | Emoji Modifier | Skin tone modifiers | Do not break from preceding Emoji Base
H2 | Hangul LV Syllable | Hangul | Form Korean syllable blocks
H3 | Hangul LVT Syllable | Hangul | Form Korean syllable blocks
HL | Hebrew Letter | Hebrew | Do not break around a following hyphen; otherwise act as Alphabetic
ID | Ideographic | Ideographs | Break before or after, except in some numeric context
JL | Hangul L Jamo | Conjoining jamo | Form Korean syllable blocks
JV | Hangul V Jamo | Conjoining jamo | Form Korean syllable blocks
JT | Hangul T Jamo | Conjoining jamo | Form Korean syllable blocks
RI | Regional Indicator | REGIONAL INDICATOR SYMBOL LETTER A .. Z | Keep pairs together. For pairs, break before and after other classes
SA | Complex Context Dependent (South East Asian) | South East Asian: Thai, Lao, Khmer | Provide a line break opportunity contingent on additional, language-specific context analysis
XX | Unknown | Most unassigned, private-use | Have as yet unknown line breaking behavior or unassigned code positions

[https://www.unicode.org/reports/tr14/tr14-41.html#Table1]

Math-property of Unicode-char

description::
Derived Property: Math
Generated from: Sm + Other_Math
[ftp://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt]

name::
* cpt.char.Unicode'Math-property,
* cpt.Unicode'Math-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt,

Numeric_Type-property of Unicode-char

description::
Derived Property: Numeric_Type
The values are based on fields 6-8 of UnicodeData.txt, plus the fields kAccountingNumeric, kOtherNumeric, kPrimaryNumeric in the Unicode Han Database (Unihan).
[ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedNumericType.txt]

name::
* cpt.char.Unicode'Numeric_Type-property,
* cpt.Unicode'Numeric_Type-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedNumericType.txt,

specific::
* Decimal: When there is a value in field 6.
* Digit: When there is a value in field 7, but not in field 6.
* Numeric: When there are values for kAccountingNumeric, kOtherNumeric, kPrimaryNumeric, or there is a value in field 8, but not in field 7.
* None: Otherwise. All code points not explicitly listed for Numeric_Type have the value None.
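
· a sketch of the Decimal / Digit / Numeric / None rule above, assuming that Python's unicodedata.decimal(), digit() and numeric() correspond to fields 6, 7 and 8 of UnicodeData.txt. How far numeric() also covers the Unihan fields is not checked here. The function name is hypothetical.

```python
import unicodedata


def numeric_type(ch: str) -> str:
    """Classify a character as Decimal, Digit, Numeric or None."""
    if unicodedata.decimal(ch, None) is not None:   # field 6 present
        return "Decimal"
    if unicodedata.digit(ch, None) is not None:     # field 7 but not 6
        return "Digit"
    if unicodedata.numeric(ch, None) is not None:   # field 8 but not 7
        return "Numeric"
    return "None"


for ch in ("5", "\u0665", "\u00B2", "\u2460", "\u00BD", "A"):
    print(f"U+{ord(ch):04X}", numeric_type(ch), unicodedata.numeric(ch, None))
```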

Numeric_Value-property of Unicode-char

description::
Finally, we have three fields, kAccountingNumeric, kOtherNumeric, and kPrimaryNumeric to indicate the numerical values an ideograph may have.
Traditionally, ideographs were used both for numbers and words, and so many ideographs have (or can have) numeric values.
The various kinds of numeric values are specified by these three fields.
[https://www.unicode.org/reports/tr38/tr38-25.html#N1024D]

name::
* cpt.char.Unicode'Numeric_Value-property,
* cpt.Unicode'Numeric_Value-property,

addressWpg::
* ftp://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedNumericValues.txt,

property of Unicode-char

description::
· property of Unicode-char is-called a-generic-attribute of a-Unicode-char, with specifics, called values.

name::
* cpt.char.Unicode'property,
* cpt.Unicode'property,

addressWpg::
* http://www.unicode.org/reports/tr44/#Property_Definitions,

specific::
* normative,
* overridable,
* informative,
* contributory,
* provisional,
===
* simple,
* derived,
===
* catalog,
* enumeration,
* binary,
* string,
* numeric,
* misc,
===
* complex,
===
* Age,
* Bidi_Class,
* Block,
* Canonical_Combining_Class,
* Decomposition_Type,
* East_Asian_Width,
* General_Category,
* Line_Break,
* Numeric_Type,
* Numeric_Value,
* Script,
* Vertical_Orientation,

property.NORMATIVE of Unicode-char

description::
Normative property: A Unicode character property used in the specification of the standard.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf, 3.5]

name::
* cpt.char.Unicode'property.normative,
* cpt.Unicode'property.normative,

property.OVERRIDABLE of Unicode-char

description::
Overridable property: A normative property whose values may be overridden by conformant higher-level protocols.
• For example, the Canonical_Decomposition property is not overridable. The Uppercase property can be overridden.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf, 3.5]

name::
* cpt.char.Unicode'property.overridable,
* cpt.Unicode'property.overridable,

property.INFORMATIVE of Unicode-char

description::
Informative property: A Unicode character property whose values are provided for information only.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf, 3.5]

name::
* cpt.char.Unicode'property.informative,
* cpt.Unicode'property.informative,

property.CONTRIBUTORY of Unicode-char

description::
Contributory property: A simple property defined merely to make the statement of a rule defining a derived property more compact or general.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf, 3.5]

name::
* cpt.char.Unicode'property.contributory,
* cpt.Unicode'property.contributory,

property.PROVISIONAL of Unicode-char

description::
Provisional property: A Unicode character property whose values are unapproved and tentative, and which may be incomplete or otherwise not in a usable state.
• Provisional properties may be removed from future versions of the standard, without prior notice.
[http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf, 3.5]

name::
* cpt.char.Unicode'property.provisional,
* cpt.Unicode'property.provisional,

property.SIMPLE of Unicode-char

description::
Some character properties in the UCD are simple properties. This status has no bearing on whether or not the properties are normative, but merely indicates that their values are not derived from some combination of other properties.
[https://www.unicode.org/reports/tr44/tr44-22.html#Simple_Props]

name::
* cpt.char.Unicode'simple-property,
* cpt.char.Unicode'property.derivedNo,
* cpt.char.Unicode'property.simple,
* cpt.Unicode'property.derivedNo,
* cpt.Unicode'property.simple,

property.DERIVED of Unicode-char

description::
Other character properties are derived. This means that their values are derived by rule from some other combination of properties. Generally such rules are stated as set operations, and may or may not include explicit exception lists for individual characters.
[https://www.unicode.org/reports/tr44/tr44-22.html#Derived_Props]

name::
* cpt.char.Unicode'derived-property,
* cpt.char.Unicode'property.derived,
* cpt.Unicode'property.derived,

property.BINARY of Unicode-char

description::
Binary properties are a special case of Enumeration properties, which have exactly two values: Yes and No (or True and False).
[http://www.unicode.org/reports/tr44/#Type_Key_Table]

name::
* cpt.char.Unicode'binary-property,
* cpt.char.Unicode'property.binary,
* cpt.Unicode'property.binary,

property.CATALOG of Unicode-char

description::
Catalog properties have enumerated values which are expected to be regularly extended in successive versions of the Unicode Standard. This distinguishes them from Enumeration properties.
[http://www.unicode.org/reports/tr44/#Type_Key_Table]

name::
* cpt.char.Unicode'catalog-property,
* cpt.char.Unicode'property.catalog,
* cpt.Unicode'property.catalog,

property.ENUMERATION of Unicode-char

description::
Enumeration properties have enumerated values which constitute a logical partition space; new values will generally not be added to them in successive versions of the standard.
[http://www.unicode.org/reports/tr44/#Type_Key_Table]

name::
* cpt.char.Unicode'enumeration-property,
* cpt.char.Unicode'property.enumeration,
* cpt.Unicode'property.enumeration,

property.STRING of Unicode-char

description::
String properties are typically mappings from a Unicode code point to another Unicode code point or sequence of Unicode code points; examples include case mappings and decomposition mappings.
[http://www.unicode.org/reports/tr44/#Type_Key_Table]

name::
* cpt.char.Unicode'string-property,
* cpt.char.Unicode'property.string,
* cpt.Unicode'property.string,

property.NUMERIC of Unicode-char

description::
Numeric properties specify the actual numeric values for digits and other characters associated with numbers in some way.
[http://www.unicode.org/reports/tr44/#Type_Key_Table]

name::
* cpt.char.Unicode'numeric-property,
* cpt.Unicode'property.numeric,

property.MISC of Unicode-char

description::
Miscellaneous properties are those properties that do not fit neatly into the other property categories; they currently include character names, comments about characters, the Script_Extensions property, the Equivalent_Unified_Ideograph property, and the Unicode_Radical_Stroke property (a combination of numeric values) documented in Unicode Standard Annex #38, "Unicode Han Database (Unihan)" [UAX38].
[http://www.unicode.org/reports/tr44/#Type_Key_Table]

name::
* cpt.char.Unicode'misc-property,
* cpt.char.Unicode'property.misc,
* cpt.Unicode'property.misc,

Unicode-STANDARD of Unicode-char

description::
· Unicode is a computing-industry standard for the consistent encoding, representation, and handling of computer-chars.
[hmnSgm.2018-06-24]

name::
* cpt.char.Unicode'standard,
* cpt.char.Unicode'Unicode-standard,
* cpt.Unicode,
* cpt.Unicode-standard,

report of Unicode

description::
Unicode Technical Reports cover a wide range of topics related to the implementation or development of the Unicode Standard. These include topics such as:
* normalizing Unicode text for comparison and storage
* collating (sorting) Unicode strings
* determining line break opportunities or other segmentation boundaries in text
* regular expression syntax extensions for Unicode text
* compressing Unicode text
These reports are normatively referenced by a number of international standards and by a wide range of products.
[https://www.unicode.org/reports/about-reports.html]

name::
* cpt.char.Unicode'standard'technical-report,
* cpt.Unicode'report,
* cpt.Unicode'technical-report,
* cpt.Unicode-technical-report,

Unicode-report.SPECIFIC

specific::
* UAX,
* UTS,
* UTR,

Unicode-report.UAX

description::
A Unicode Standard Annex (UAX) forms an integral part of the Unicode Standard, but is published as a separate document.
The Unicode Standard may require conformance to normative content in a Unicode Standard Annex, if so specified in the Conformance chapter of that version of the Unicode Standard.
The version number of a UAX document is always the same as the version of the Unicode Standard of which it forms a part.
[https://www.unicode.org/reports/about-reports.html]

name::
* cpt.Unicode'UAX-Unicode-Standard-Annex,
* cpt.Unicode-Standard-Annex-UAX,
* cpt.UAX-Unicode-Standard-Annex,

specific::
* [UAX9] Unicode Standard Annex #9: Unicode Bidirectional Algorithm, Latest version: http://www.unicode.org/reports/tr9/
* [UAX11] Unicode Standard Annex #11: East Asian Width, Latest version: http://www.unicode.org/reports/tr11/
* [UAX14] Unicode Standard Annex #14: Unicode Line Breaking Algorithm, Latest version: http://www.unicode.org/reports/tr14/
* [UAX15] Unicode Standard Annex #15: Unicode Normalization Forms, Latest version: http://www.unicode.org/reports/tr15/
* [UAX24] Unicode Standard Annex #24: Unicode Script Property, Latest version: http://www.unicode.org/reports/tr24/
* [UAX29] Unicode Standard Annex #29: Unicode Text Segmentation, Latest version: http://www.unicode.org/reports/tr29/
* [UAX31] Unicode Standard Annex #31: Unicode Identifier and Pattern Syntax, Latest version: http://www.unicode.org/reports/tr31/
* [UAX34] Unicode Standard Annex #34: Unicode Named Character Sequences, Latest version: http://www.unicode.org/reports/tr34/
* [UAX38] Unicode Standard Annex #38: Unicode Han Database (Unihan), Latest version: http://www.unicode.org/reports/tr38/
* [UAX41] Unicode Standard Annex #41: Common References for Unicode Standard Annexes, Latest version: http://www.unicode.org/reports/tr41/
* [UAX42] Unicode Standard Annex #42: Unicode Character Database in XML, Latest version: http://www.unicode.org/reports/tr42/
* [UAX44] Unicode Standard Annex #44: Unicode Character Database, Latest version: http://www.unicode.org/reports/tr44/
* [UAX45] Unicode Standard Annex #45: U-Source Ideographs, Latest version: http://www.unicode.org/reports/tr45/
* [UAX50] Unicode Standard Annex #50: Unicode Vertical Text Layout, Latest version: http://www.unicode.org/reports/tr50/
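
· as an-illustration of one annex from the-list, the-East-Asian-Width property of UAX #11 is exposed by Python's standard `unicodedata` module.
· the-function `display_columns` below is a-hypothetical, rough estimate: it assumes that Wide (W) and Fullwidth (F) chars take two columns and every other char takes one, and it ignores combining marks and the-Ambiguous (A) class.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Rough column count for monospaced display (an assumption, not UAX #11 itself).

    Wide ("W") and Fullwidth ("F") characters are counted as two columns,
    everything else as one.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


print(unicodedata.east_asian_width("A"))        # Na (narrow)
print(unicodedata.east_asian_width("\u4e00"))   # W  (wide, CJK ideograph)
print(display_columns("abc"))                   # 3
print(display_columns("\u4e00\u4e8c"))          # 4
```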

Unicode-report.UTS

description::
A Unicode Technical Standard (UTS) is an independent specification.
Conformance to the Unicode Standard does not imply conformance to any UTS.
[https://www.unicode.org/reports/about-reports.html]

name::
* cpt.Unicode'UTS-Unicode-Technical-Standard,
* cpt.Unicode-Technical-Standard-UTS,
* cpt.UTS-Unicode-Technical-Standard,

Unicode-report.UTR

description::
A Unicode Technical Report (UTR) contains informative material.
Conformance to the Unicode Standard does not imply conformance to any UTR.
Other specifications, however, are free to make normative references to a UTR.
[https://www.unicode.org/reports/about-reports.html]
===
· here, there-is a name collision.
· a-Unicode-Technical-Report-UTR is a-type of Unicode-technical-report!!!
[hmnSgm.2018-06-20]

name::
* cpt.Unicode'UTR-Unicode-Technical-Report,
* cpt.Unicode-Technical-Report-UTR,
* cpt.UTR-Unicode-Technical-Report,

Unicode-Character-Database

description::
The Unicode Standard is far more than a simple encoding of characters.
The standard also associates a rich set of semantics with each encoded character—properties that are required for interoperability and correct behavior in implementations, as well as for Unicode conformance.
These semantics are cataloged in the Unicode Character Database (UCD), a collection of data files which contain the Unicode character code points and character names.
The data files define the Unicode character properties and mappings between Unicode characters (such as case mappings).
[https://www.unicode.org/reports/tr44/tr44-22.html#Introduction]
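===
· a-minimal sketch of reading such properties: Python's standard `unicodedata` module ships a-compiled copy of the-UCD, so names, general-categories and case-mappings can-be-queried per char.
· the-function `describe` and its dictionary keys are hypothetical; which UCD version is bundled depends on the-Python build (see `unicodedata.unidata_version`).

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few UCD-backed properties of one character (hypothetical helper)."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<no name>"),
        "general_category": unicodedata.category(ch),
        "lowercase": ch.lower(),  # full lowercase mapping as applied by str.lower()
    }


print(unicodedata.unidata_version)  # UCD version bundled with this Python build
print(describe("A"))
```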

name::
* cpt.char.Unicode'UCD,
* cpt.char.Unicode'Unicode-database,
* cpt.UCD-Unicode-Character-Database,
* cpt.Unicode'database-(UCD),
* cpt.Unicode-Character-Database-(UCD),

file of UCD

name::
* cpt.char.Unicode'UCD'file,
* cpt.Unicode'UCD'file,

specific::
* ArabicShaping.txt,
* BidiBrackets.txt,
* BidiCharacterTest.txt,
* BidiMirroring.txt,
* BidiTest.txt,
* Blocks.txt,
* CJKRadicals.txt,
* CaseFolding.txt,
* CompositionExclusions.txt,
* DerivedAge.txt,
* DerivedCoreProperties.txt,
* DerivedNormalizationProps.txt,
* EastAsianWidth.txt,
* EmojiSources.txt,
* EquivalentUnifiedIdeograph.txt,
* HangulSyllableType.txt,
* Index.txt,
* IndicPositionalCategory.txt,
* IndicSyllabicCategory.txt,
* Jamo.txt,
* LineBreak.txt,
* NameAliases.txt,
* NamedSequences.txt,
* NamedSequencesProv.txt,
* NamesList.html,
* NamesList.txt,
* NormalizationCorrections.txt,
* NormalizationTest.txt,
* NushuSources.txt,
* PropList.txt,
* PropertyAliases.txt,
* PropertyValueAliases.txt,
* ReadMe.txt,
* ScriptExtensions.txt,
* Scripts.txt,
* SpecialCasing.txt,
* StandardizedVariants.txt,
* TangutSources.txt,
* UCD.zip,
* USourceData.txt,
* USourceGlyphs.pdf,
* UnicodeData.txt,
* Unihan.zip,
* VerticalOrientation.txt,
* auxiliary/,
-- GraphemeBreakProperty.txt,
-- GraphemeBreakTest.html,
-- GraphemeBreakTest.txt,
-- LineBreakTest.html,
-- LineBreakTest.txt,
-- SentenceBreakProperty.txt,
-- SentenceBreakTest.html,
-- SentenceBreakTest.txt,
-- WordBreakProperty.txt,
-- WordBreakTest.html,
-- WordBreakTest.txt,
* extracted/,
-- DerivedBidiClass.txt,
-- DerivedBinaryProperties.txt,
-- DerivedCombiningClass.txt,
-- DerivedDecompositionType.txt,
-- DerivedEastAsianWidth.txt,
-- DerivedGeneralCategory.txt,
-- DerivedJoiningGroup.txt,
-- DerivedJoiningType.txt,
-- DerivedLineBreak.txt,
-- DerivedName.txt,
-- DerivedNumericType.txt,
-- DerivedNumericValues.txt,
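
· most of these files are plain-text tables with semicolon-separated fields. a-hedged sketch of parsing one line of UnicodeData.txt follows, assuming the-15-field layout described in UAX #44 (field 0 code-point, 1 name, 2 general-category, 13 simple-lowercase-mapping).
· the-names `UnicodeDataRecord` and `parse_unicode_data_line` are hypothetical, and only four of the-fields are kept.

```python
from typing import NamedTuple, Optional


class UnicodeDataRecord(NamedTuple):
    """A small, hypothetical subset of one UnicodeData.txt record."""
    code_point: int
    name: str
    general_category: str
    simple_lowercase: Optional[int]


def parse_unicode_data_line(line: str) -> UnicodeDataRecord:
    """Parse one semicolon-separated UnicodeData.txt line (15 fields assumed)."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    lower = fields[13]  # empty when the character has no simple lowercase mapping
    return UnicodeDataRecord(
        code_point=int(fields[0], 16),
        name=fields[1],
        general_category=fields[2],
        simple_lowercase=int(lower, 16) if lower else None,
    )


# The record for U+0041 as it appears in UnicodeData.txt.
SAMPLE = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
record = parse_unicode_data_line(SAMPLE)
print(record)
```

· ranges such as the-CJK ideographs are stored in the-file as "First"/"Last" line pairs, which this sketch does not expand.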

resource of UCD

addressWpg::
* files.latest: ftp://www.unicode.org/Public/UCD/latest/ucd/,
* files.11: ftp://www.unicode.org/Public/11.0.0/ucd/,
* Unicode® Standard Annex #44: UNICODE CHARACTER DATABASE, https://www.unicode.org/reports/tr44/,

Unicode-Han-Database

description::
The Unihan database is the repository for the Unicode Consortium’s collective knowledge regarding the CJK Unified Ideographs contained in the Unicode Standard. It contains mapping data to allow conversion to and from other coded character sets and additional information to help implement support for the various languages which use the Han ideographic script.
Formally, ideographs are defined within the Unicode Standard via their mappings. That is, the Unicode Standard does not formally define what the ideograph U+4E00 is; rather, it defines it as being the equivalent of, say, 0x523B in GB 2312, 0x14421 in CNS 11643, 0x306C in JIS X 0208, and so on.
In practice, implementation of ideographs requires large amounts of ancillary data. Input methods require information such as pronunciations, as do collation algorithms. Data in character sets not included in the world of international standards bodies needs to be converted. Relationships between ideographs need to be defined to allow for fuzzy string matching. Beyond all this, it’s important to track not only what properties a given ideograph has, but who claims it has those properties.
Unlike characters in Western scripts such as Latin and Greek, whose basic property is their sound, which stays largely constant across languages, the basic property for Han ideographs is their meaning. This isn’t to say that ideographs are truly ideographic, in that they represent abstract ideas; but they generally have one root meaning from which the others derive, and generally retain the bulk of their semantic content across linguistic boundaries. Most ideographs are divided into a determinative, which gives a vague sense of meaning, and a phonetic, which gives a vague sense of pronunciation. The Unihan database therefore includes structural analyses and definitions for ideographs.
[https://www.unicode.org/reports/tr38/tr38-25.html]
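===
· the-mappings quoted for U+4E00 can-be-checked with Python's built-in codecs. this sketch assumes that the-`gb2312` and `euc_jp` codecs emit the-EUC form of GB 2312 and JIS X 0208, in which each byte is the-7-bit code plus 0x80.
· the-helper `euc_to_row_cell` is hypothetical; CNS 11643 is left out because it is not checked here.

```python
def euc_to_row_cell(encoded: bytes) -> int:
    """Convert a two-byte EUC sequence to the 7-bit two-byte code (hypothetical helper)."""
    if len(encoded) != 2:
        raise ValueError("expected a two-byte EUC sequence")
    high, low = encoded
    return ((high - 0x80) << 8) | (low - 0x80)


ideograph = "\u4e00"  # CJK UNIFIED IDEOGRAPH-4E00

gb2312 = euc_to_row_cell(ideograph.encode("gb2312"))     # EUC-CN bytes D2 BB
jis_x_0208 = euc_to_row_cell(ideograph.encode("euc_jp"))  # EUC-JP bytes B0 EC

print(f"GB 2312:    0x{gb2312:04X}")      # 0x523B
print(f"JIS X 0208: 0x{jis_x_0208:04X}")  # 0x306C
```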

name::
* cpt.char.Unicode'Unicode-Han-Database,
* cpt.Unicode'Unicode-Han-Database,
* cpt.Unihan-Unicode-Han-Database,

addressWpg::
* https://www.unicode.org/reports/tr38/,

resource of Unicode

name::
* cpt.char.Unicode'standard'resource,
* cpt.Unicode'resource,

addressWpg::
* http://www.unicode.org/,
* 11.0.0: https://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf,

Unicode.SPECIFIC

name::
* cpt.char.Unicode'standard.specific,
* cpt.Unicode.specific,

addressWpg::
* https://www.unicode.org/versions/,

Unicode.11-0-0.2018

description::
The Unicode Consortium. The Unicode Standard, Version 11.0.0, (Mountain View, CA: The Unicode Consortium, 2018. ISBN 978-1-936213-19-1)
http://www.unicode.org/versions/Unicode11.0.0/
===
· script= 146
· char= 137,439 Dogra, Georgian Mtavruli capital letters, Gunjala Gondi, Hanifi Rohingya, Indic Siyaq numbers, Makasar, Medefaidrin, Old Sogdian and Sogdian, Mayan numerals, 5 urgently needed CJK unified ideographs, symbols for xiangqi (Chinese chess) and star ratings, and 145 emoji
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.11-0-0.2018,
* cpt.Unicode.11-0-0.2018,

Unicode.10-0-0.2017

description::
The Unicode Consortium. The Unicode Standard, Version 10.0.0, (Mountain View, CA: The Unicode Consortium, 2017. ISBN 978-1-936213-16-0)
http://www.unicode.org/versions/Unicode10.0.0/
===
· script= 139
· char= 136,755 Zanabazar Square, Soyombo, Masaram Gondi, Nüshu, hentaigana (non-standard hiragana), 7,494 CJK unified ideographs, and 56 emoji
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.10-0-0.2017,
* cpt.Unicode.10-0-0.2017,

Unicode.9-0-0.2016

description::
The Unicode Consortium. The Unicode Standard, Version 9.0.0, (Mountain View, CA: The Unicode Consortium, 2016. ISBN 978-1-936213-13-9)
http://www.unicode.org/versions/Unicode9.0.0/
===
· script= 135
· char= 128,237 Adlam, Bhaiksuki, Marchen, Newa, Osage, Tangut, and 72 emoji
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.9-0-0.2016,
* cpt.Unicode.9-0-0.2016,

Unicode.8-0-0.2015

description::
The Unicode Consortium. The Unicode Standard, Version 8.0.0, (Mountain View, CA: The Unicode Consortium, 2015. ISBN 978-1-936213-10-8)
http://www.unicode.org/versions/Unicode8.0.0/
===
· script= 129
· char= 120,737 Ahom, Anatolian hieroglyphs, Hatran, Multani, Old Hungarian, SignWriting, 5,771 CJK unified ideographs, a set of lowercase letters for Cherokee, and five emoji skin tone modifiers
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.8-0-0.2015,
* cpt.Unicode.8-0-0.2015,

Unicode.7-0-0.2014

description::
The Unicode Consortium. The Unicode Standard, Version 7.0.0, (Mountain View, CA: The Unicode Consortium, 2014. ISBN 978-1-936213-09-2)
http://www.unicode.org/versions/Unicode7.0.0/
===
· script= 123
· char= 113,021 Bassa Vah, Caucasian Albanian, Duployan, Elbasan, Grantha, Khojki, Khudawadi, Linear A, Mahajani, Manichaean, Mende Kikakui, Modi, Mro, Nabataean, Old North Arabian, Old Permic, Pahawh Hmong, Palmyrene, Pau Cin Hau, Psalter Pahlavi, Siddham, Tirhuta, Warang Citi, and Dingbats.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.7-0-0.2014,
* cpt.Unicode.7-0-0.2014,

Unicode.6-3-0.2013

description::
The Unicode Consortium. The Unicode Standard, Version 6.3.0, (Mountain View, CA: The Unicode Consortium, 2013. ISBN 978-1-936213-08-5)
http://www.unicode.org/versions/Unicode6.3.0/
===
· script= 100
· char= 110,187; 5 bidirectional formatting characters added.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.6-3-0.2013,
* cpt.Unicode.6-3-0.2013,

Unicode.6-2-0.2012

description::
The Unicode Consortium. The Unicode Standard, Version 6.2.0, (Mountain View, CA: The Unicode Consortium, 2012. ISBN 978-1-936213-07-8)
http://www.unicode.org/versions/Unicode6.2.0/
===
· script= 100
· char= 110,182 Turkish lira sign.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.6-2-0.2012,
* cpt.Unicode.6-2-0.2012,

Unicode.6-1-0.2012

description::
The Unicode Consortium. The Unicode Standard, Version 6.1.0, (Mountain View, CA: The Unicode Consortium, 2012. ISBN 978-1-936213-02-3)
http://www.unicode.org/versions/Unicode6.1.0/
===
· script= 100
· char= 110,181 Chakma, Meroitic cursive, Meroitic hieroglyphs, Miao, Sharada, Sora Sompeng, and Takri.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.6-1-0.2012,
* cpt.Unicode.6-1-0.2012,

Unicode.6-0-0.2011

description::
The Unicode Consortium. The Unicode Standard, Version 6.0.0, (Mountain View, CA: The Unicode Consortium, 2011. ISBN 978-1-936213-01-6)
http://www.unicode.org/versions/Unicode6.0.0/
===
· script= 93
· char= 109,449 Batak, Brahmi, Mandaic, playing card symbols, transport and map symbols, alchemical symbols, emoticons and emoji. 222 additional CJK Unified Ideographs (CJK-D) added.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.6-0-0.2011,
* cpt.Unicode.6-0-0.2011,

Unicode.5-2-0.2009

description::
The Unicode Consortium. The Unicode Standard, Version 5.2.0 (Mountain View, CA: The Unicode Consortium, 2009. ISBN 978-1-936213-00-9)
http://www.unicode.org/versions/Unicode5.2.0/
===
· script= 90
· char= 107,361 Avestan, Bamum, Egyptian hieroglyphs (the Gardiner Set, comprising 1,071 characters), Imperial Aramaic, Inscriptional Pahlavi, Inscriptional Parthian, Javanese, Kaithi, Lisu, Meetei Mayek, Old South Arabian, Old Turkic, Samaritan, Tai Tham and Tai Viet added. 4,149 additional CJK Unified Ideographs (CJK-C), as well as extended Jamo for Old Hangul, and characters for Vedic Sanskrit.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.5-2-0.2009,
* cpt.Unicode.5-2-0.2009,

Unicode.5-1-0.2007

description::
The Unicode Consortium. The Unicode Standard, Version 5.1.0, defined by: The Unicode Standard, Version 5.0 (Boston, MA, Addison-Wesley, 2007. ISBN 0-321-48091-0), as amended by Unicode 5.1.0
http://www.unicode.org/versions/Unicode5.1.0/
===
· script= 75
· char= 100,713 Carian, Cham, Kayah Li, Lepcha, Lycian, Lydian, Ol Chiki, Rejang, Saurashtra, Sundanese, and Vai added, as well as sets of symbols for the Phaistos Disc, Mahjong tiles, and Domino tiles. There were also important additions for Burmese, additions of letters and Scribal abbreviations used in medieval manuscripts, and the addition of Capital ẞ.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.5-1-0.2007,
* cpt.Unicode.5-1-0.2007,

Unicode.5-0-0.2007

description::
The Unicode Consortium. The Unicode Standard, Version 5.0.0, defined by: The Unicode Standard, Version 5.0 (Boston, MA, Addison-Wesley, 2007. ISBN 0-321-48091-0)
===
· script= 64
· char= 99,089 Balinese, Cuneiform, N'Ko, Phags-pa, and Phoenician added.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.5-0-0.2007,
* cpt.Unicode.5-0-0.2007,

Unicode.4-1-0.2005

description::
· date= March 2005
· ISO= ISO/IEC 10646:2003 plus Amendment 1
· script= 59
· char= 97,720 Buginese, Glagolitic, Kharoshthi, New Tai Lue, Old Persian, Syloti Nagri, and Tifinagh added, and Coptic was disunified from Greek. Ancient Greek numbers and musical symbols were also added.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.4-1-0.2005,
* cpt.Unicode.4-1-0.2005,

Unicode.4-0-1.2003

description::
The Unicode Consortium. The Unicode Standard, Version 4.0.1, defined by: The Unicode Standard, Version 4.0 (Boston, MA, Addison-Wesley, 2003. ISBN 0-321-18578-1), as amended by Unicode 4.0.1
http://www.unicode.org/versions/Unicode4.0.1/

name::
* cpt.char.Unicode'standard.4-0-1.2003,
* cpt.Unicode.4-0-1.2003,

Unicode.4-0-0.2003

description::
· date= April 2003
· book= ISBN 0-321-18578-1
· ISO= ISO/IEC 10646:2003
· script= 52
· char= 96,447 Cypriot syllabary, Limbu, Linear B, Osmanya, Shavian, Tai Le, and Ugaritic added, as well as Hexagram symbols.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.4-0-0.2003,
* cpt.Unicode.4-0-0.2003,

Unicode.3-2-0.2000

description::
The Unicode Consortium, The Unicode Standard, Version 3.2.0,
defined by: The Unicode Standard, Version 3.0 (Reading, MA: Addison-Wesley, 2000. ISBN 0-201-61633-5),
as amended by the Unicode Standard Annex #27: Unicode 3.1 and the Unicode Standard Annex #28: Unicode 3.2
http://www.unicode.org/reports/tr28/
===
· script= 45
· char= 95,221 Philippine scripts Buhid, Hanunó'o, Tagalog, and Tagbanwa added.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.3-2-0.2000,
* cpt.Unicode.3-2-0.2000,

Unicode.3-1-0.2000

description::
The Unicode Consortium, The Unicode Standard, Version 3.1.0,
defined by: The Unicode Standard, Version 3.0 (Reading, MA: Addison-Wesley, 2000. ISBN 0-201-61633-5),
as amended by the Unicode Standard Annex #27: Unicode 3.1
http://www.unicode.org/reports/tr27/
===
· script= 41
· char= 94,205 Deseret, Gothic and Old Italic added, as well as sets of symbols for Western music and Byzantine music, and 42,711 additional CJK Unified Ideographs.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.3-1-0.2000,
* cpt.Unicode.3-1-0.2000,

Unicode.3-0-0.2000

description::
The Unicode Consortium, The Unicode Standard, Version 3.0.0
defined by: The Unicode Standard, Version 3.0 (Reading, MA: Addison-Wesley, 2000. ISBN 0-201-61633-5),
http://www.unicode.org/versions/Unicode3.0.0/
===
· script= 38
· char= 49,259 Cherokee, Ethiopic, Khmer, Mongolian, Burmese, Ogham, Runic, Sinhala, Syriac, Thaana, Unified Canadian Aboriginal Syllabics, and Yi Syllables added, as well as a set of Braille patterns.
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.3-0-0.2000,
* cpt.Unicode.3-0-0.2000,

Unicode.2-1-0.1998

description::
· date= May 1998
· ISO= ISO/IEC 10646-1:1993 plus Amendments 5, 6 and 7, and two characters from Amendment 18
· script= 25
· char= 38,952 Euro sign added.
[http://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.2-1-0.1998,
* cpt.Unicode.2-1-0.1998,

Unicode.2-0-0.1996

description::
· date= July 1996
· book= ISBN 0-201-48345-9
· ISO= ISO/IEC 10646-1:1993 plus Amendments 5, 6 and 7
· script= 25
· char= 38,950 Original set of Hangul syllables removed, and a new set of 11,172 Hangul syllables added at a new location. Tibetan added back in a new location and with a different character repertoire. Surrogate character mechanism defined, and Plane 15 and Plane 16 Private Use Areas allocated.
[http://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.2-0-0.1996,
* cpt.Unicode.2-0-0.1996,

Unicode.1-1-0.1993

description::
· date= June 1993
· ISO= ISO/IEC 10646-1:1993
· script= 24
· char= 34,233; 4,306 more Hangul syllables added to original set of 2,350 characters. Tibetan removed.
[http://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.1-1-0.1993,
* cpt.Unicode.1-1-0.1993,

Unicode.1-0-1.1992

description::
· date= June 1992
· book= ISBN 0-201-60845-6 (Vol.2)
· script= 25
· char= 28,359 The initial set of 20,902 CJK Unified Ideographs is defined.
[http://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.1-0-1.1992,
* cpt.Unicode.1-0-1.1992,

Unicode.1-0-0.1991

description::
· date= October 1991
· book= ISBN 0-201-56788-1 (Vol.1)
· script= 24
· char= 7,161 Initial repertoire covers these scripts: Arabic, Armenian, Bengali, Bopomofo, Cyrillic, Devanagari, Georgian, Greek and Coptic, Gujarati, Gurmukhi, Hangul, Hebrew, Hiragana, Kannada, Katakana, Lao, Latin, Malayalam, Oriya, Tamil, Telugu, Thai, and Tibetan.
[http://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.char.Unicode'standard.1-0-0.1991,
* cpt.Unicode.1-0-0.1991,

GENERIC of Unicode-char

generic-chain::
* computer-char,

Unicode-char.SPECIFIC

name::
* cpt.char.Unicode.specific,

specific::
* alphabetic,
* graphic,
* graphicNo,
* math,

Unicode-char.AGGREGATE

description::
· 137,439 v.11-0-0.2018
· 136,755 v.10-0-0.2017
· 128,237 v.9-0-0.2016
· 120,737 v.8-0-0.2015
· 113,021 v.7-0-0.2014
· 110,187 v.6-3-0.2013
· 110,182 v.6-2-0.2012
· 110,181 v.6-1-0.2012
· 109,449 v.6-0-0.2011
· 107,361 v.5-2-0.2009
· 100,713 v.5-1-0.2007
· 99,089 v.5-0-0.2007
· 97,720 v.4-1-0.2005
· 96,447 v.4-0-0.2003
· 95,221 v.3-2-0.2000
· 94,205 v.3-1-0.2000
· 49,259 v.3-0-0.2000
· 38,952 v.2-1-0.1998
· 38,950 v.2-0-0.1996
· 34,233 v.1-1-0.1993
· 28,359 v.1-0-1.1992
· 7,161 v.1-0-0.1991
[https://en.wikipedia.org/wiki/Unicode#Versions]

name::
* cpt.aggregate-Unicode-char,
* cpt.char.Unicode.aggregate,
* cpt.Unicode'char.aggregate,

Unicode-char.SCRIPT

description::
· Unicode-script is A-SET of Unicode-chars used in one or more written-human-languages.
[hmnSgm.2018-06-26]

name::
* cpt.char.Unicode.script,
* cpt.script.Unicode,
* cpt.Unicode'script,
* cpt.Unicode-char.script,
* cpt.Unicode-script,

addressWpg::
* https://en.wikipedia.org/wiki/Script_(Unicode)#List_of_scripts_in_Unicode,

char of Unicode-script

description::
· every Unicode-script contains a-quantity of chars.

directionality of Unicode-script

description::
· L-to-R
· R-to-L
· Varies
· T-to-B (Mongolian)
· Inherited
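
· a-related, per-char view of direction is the-Bidi_Class property used by the-Unicode-Bidirectional-Algorithm (UAX #9). it is a-property of chars, not of scripts, so the-sketch below only approximates the-list above.
· the-function `is_rtl` is hypothetical; it treats the-classes R and AL as right-to-left.

```python
import unicodedata


def is_rtl(ch: str) -> bool:
    """True when the character's Bidi_Class is strongly right-to-left (R or AL)."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


samples = {
    "Latin a": "a",
    "Hebrew alef": "\u05d0",
    "Arabic alef": "\u0627",
    "digit one": "1",
}
for label, ch in samples.items():
    print(label, unicodedata.bidirectional(ch), is_rtl(ch))
```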

Unicode-script.SPECIFIC

specific::
* common,
* inherited,
=== aggregate:
· 146 v.11-0-0.2018
· 139 v.10-0-0.2017
· 135 v.9-0-0.2016
· 129 v.8-0-0.2015
· 123 v.7-0-0.2014
· 100 v.6-3-0.2013
· 93 v.6-0-0.2011
· 90 v.5-2-0.2009
· 75 v.5-1-0.2007
· 64 v.5-0-0.2007
· 59 v.4-1-0.2005
· 52 v.4-0-0.2003
· 45 v.3-2-0.2000
· 41 v.3-1-0.2000
· 38 v.3-0-0.2000
· 25 v.2-1-0.1998
· 24 v.1-1-0.1993
· 25 v.1-0-1.1992
· 24 v.1-0-0.1991
[https://en.wikipedia.org/wiki/Unicode#Versions]

Unicode-script.COMMON

description::
Unicode can assign a character in the UCS to a single script only. However, many characters, namely those that are not part of a formal natural-language writing system or that are unified across many writing systems, may be used in more than one script. Examples are currency signs, symbols, numerals and punctuation marks. In these cases Unicode defines them as belonging to the "common" script (ISO 15924 code "Zyyy").
[https://en.wikipedia.org/wiki/Script_(Unicode)#Special_script_property_values {2018-06-26}]

name::
* cpt.common-script.Unicode,
* cpt.Unicode'script.common,

Unicode-script.INHERITED

description::
Many diacritics and non-spacing combining characters may be applied to characters from more than one script. In these cases Unicode assigns them to the "inherited" script (ISO 15924 code Zinh), which means that they have the same script class as the base character with which they combine, and so in different contexts they may be treated as belonging to different scripts. For example, U+0308 ̈ Combining Diaeresis may combine with either U+0065 e Latin Small Letter E to create a Latin "ë", or with U+0435 е Cyrillic Small Letter IE for the Cyrillic "ё". In the former case it inherits the Latin script of the base character whereas in the latter case it inherits the Cyrillic script of the base character.
[https://en.wikipedia.org/wiki/Script_(Unicode)#Special_script_property_values, {2018-06-26}]
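===
· the-example above can-be-reproduced with Python's standard `unicodedata` module: the-same combining-char U+0308 composes (under NFC) with a-Latin base and with a-Cyrillic base.
· the-helper `compose_with_diaeresis` is hypothetical. Python's standard library does not expose the-Script property itself, so only the-composed chars and their names are shown.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"


def compose_with_diaeresis(base: str) -> str:
    """Attach U+0308 to a base character and compose the pair with NFC."""
    return unicodedata.normalize("NFC", base + COMBINING_DIAERESIS)


for base in ("\u0065", "\u0435"):  # Latin e, Cyrillic ie
    composed = compose_with_diaeresis(base)
    print(f"U+{ord(composed):04X}", unicodedata.name(composed))
# U+00EB LATIN SMALL LETTER E WITH DIAERESIS
# U+0451 CYRILLIC SMALL LETTER IO
```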

name::
* cpt.inherited-script.Unicode,
* cpt.Unicode'script.inherited,

Unicode-char.GRAPHIC

description::
In Unicode, Graphic characters are those with General Category Letter, Mark, Number, Punctuation, Symbol or Zs=space.
[https://en.wikipedia.org/wiki/Graphic_character#Unicode]
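===
· the-definition above maps directly onto the-General_Category values returned by Python's `unicodedata.category`.
· the-function `is_graphic` is a-hypothetical helper written from that definition, not an-official API.

```python
import unicodedata


def is_graphic(ch: str) -> bool:
    """True for General Category L*, M*, N*, P*, S* or Zs, per the definition above."""
    category = unicodedata.category(ch)
    return category[0] in "LMNPS" or category == "Zs"


print(is_graphic("A"))        # True  (Lu)
print(is_graphic(" "))        # True  (Zs)
print(is_graphic("\x07"))     # False (Cc, the BEL control char)
print(is_graphic("\u2028"))   # False (Zl, LINE SEPARATOR)
```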

name::
* cpt.char.graphic.Unicode,
* cpt.char.Unicode.graphic,
* cpt.graphic-Unicode-char,
* cpt.Unicode'char.graphic,

char.HTML

description::
· Html-char is a-char in a-format used in the-Html-computer-language.
[hmnSgm.2018-06-22]

name::
* cpt.char.Html,
* cpt.Html-char,
* cpt.Html-character-reference,

Html-char.DECIMAL

description::
· a-Unicode-char can-be-written in Html, using its Unicode-decimal-code-point, as: &#DECIMALCP;
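· a-minimal Python sketch of this form: `ord` gives the-decimal code-point, and the-standard `html.unescape` decodes the-reference back. the-function-name `decimal_reference` is hypothetical.

```python
import html


def decimal_reference(ch: str) -> str:
    """Build the decimal HTML character reference for one character."""
    return f"&#{ord(ch)};"


reference = decimal_reference("\u20ac")  # EURO SIGN, U+20AC = 8364 decimal
print(reference)                         # &#8364;
print(html.unescape(reference) == "\u20ac")  # True
```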

name::
* cpt.char.Html.decimal,
* cpt.decimal-Html-char,
* cpt.Html-char.decimal,

Html-char.HEXADECIMAL

description::
· a-Unicode-char can-be-written in Html, using its Unicode-hexadecimal-code-point, as: &#xHEXCP;
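· a-minimal Python sketch of this form, formatting the-code-point in uppercase hexadecimal. the-function-name `hexadecimal_reference` is hypothetical.

```python
import html


def hexadecimal_reference(ch: str) -> str:
    """Build the hexadecimal HTML character reference for one character."""
    return f"&#x{ord(ch):X};"


reference = hexadecimal_reference("\u20ac")  # EURO SIGN
print(reference)                             # &#x20AC;
print(html.unescape(reference) == "\u20ac")  # True
```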

name::
* cpt.char.Html.hexadecimal,
* cpt.hexadecimal-Html-char,
* cpt.Html-char.hexadecimal,

Html-char.NAMED

description::
· some chars are-written with a-name: &NAME;
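· a-hedged Python sketch: the-standard `html.entities.codepoint2name` table holds the-HTML-4 entity names, so it covers only part of the-names that HTML5 defines. the-function `named_reference` is hypothetical and returns None when the-table has no name for the-char.

```python
import html
from html.entities import codepoint2name
from typing import Optional


def named_reference(ch: str) -> Optional[str]:
    """Return the named reference for a character, or None if no HTML 4 name exists."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else None


print(named_reference("\u20ac"))   # &euro;
print(named_reference("A"))        # None: plain letters have no entity name
print(html.unescape("&euro;") == "\u20ac")  # True
```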

name::
* cpt.char.Html.named,
* cpt.Html-char.named,
* cpt.named-Html-char,

char.ESCAPE

description::
In computing and telecommunication, an escape character is a character which invokes an alternative interpretation on subsequent characters in a character sequence. An escape character is a particular case of metacharacters. Generally, the judgment of whether something is an escape character or not depends on context.
[https://en.wikipedia.org/wiki/Escape_character {2018-06-25}]
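===
· as one concrete case, the-backslash is the-escape-char of Python string-literals: it changes how the-following char is-read. the-variable names below are illustrative only.

```python
escaped = "a\nb"    # backslash + n is read as ONE char, a line feed
literal = "a\\nb"   # an escaped backslash yields a real backslash, then "n"
raw = r"a\nb"       # in a raw literal the backslash is not an escape char

print(len(escaped))    # 3
print(len(literal))    # 4
print(literal == raw)  # True
```

· the-same backslash is an-ordinary char inside the-raw literal, which illustrates that whether a-char escapes depends on context.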

name::
* cpt.char.escape,
* cpt.escape-char,

char.GRAPHIC

description::
In ISO/IEC 646 (commonly known as ASCII) and related standards including ISO 8859 and Unicode, a graphic character is any character intended to be written, printed, or otherwise displayed in a form that can be read by humans. In other words, it is any encoded character that is associated with one or more glyphs.
[https://en.wikipedia.org/wiki/Graphic_character {2018-06-25}]

name::
* cpt.char.glyph,
* cpt.char.graphic,
* cpt.glyph-char,
* cpt.graphic-char,

specific::
* Unicode-graphic-char,

char.GRAPHIC.NO

description::
· graphicNo-char is a-computer-char WITHOUT a-written or printed icon associated with it.
· all chars of written-human-languages are graphic-chars.
[hmnSgm.2018-06-26]

name::
* cpt.char.glyphNo,
* cpt.char.graphicNo,
* cpt.glyphNo-char,
* cpt.graphicNo-char,

char.EVOLUTING

name::
* cpt.char.evoluting,

{time.2018-06-19}::
=== creation of this structured-concept:

meta-info

SEARCH::
· this page uses 'locator-names', names that when you find them, you find the-LOCATION of the-concept they denote.
LOCAL-SEARCH:
· TYPE CTRL+F "cpt.words-of-concept's-name", to go to the-LOCATION of the-concept.
GLOBAL-SEARCH:
· clicking on the-green-TITLE of a-page you have access to the-global--locator-names of my-site.
· a-preview of the-description of a-global-name makes reading fast.

footer::
• author: Kaseluris.Nikos.1959
• twitter: @synagonism
• steemit: https://steemit.com/@synagonism

webpage-versions::
• version.last.dynamic: filMcsChar.last.html,
• version.0-1-0.2018-06-19.created: filMcsChar.0-1-0.2018-06-19.html,
