system.unicode
The system.unicode table is a virtual table that provides information about Unicode characters and their properties (https://unicode-org.github.io/icu/userguide/strings/properties.html). The table is generated on the fly.
Columns
Note
The ICU property names of Unicode code points are converted to snake case for use as column names.
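As a rough illustration of that naming convention, the sketch below lowercases a few ICU long property names. `to_column_name` is a hypothetical helper, not part of ClickHouse or ICU. It assumes that the ICU long names already separate words with underscores, so lowercasing alone is enough. The actual conversion inside ClickHouse may differ.

```python
# Hypothetical helper for illustration only; not a ClickHouse or ICU API.
def to_column_name(icu_property_name: str) -> str:
    """Lowercase an ICU long property name such as 'ASCII_Hex_Digit'."""
    # Assumption: ICU long names already use underscores between words,
    # so lowercasing is sufficient to obtain snake case.
    return icu_property_name.lower()


for icu_name in ["Alphabetic", "ASCII_Hex_Digit", "Bidi_Paired_Bracket_Type"]:
    print(icu_name, "->", to_column_name(icu_name))
```

The resulting names match the column names listed in this section, such as ascii_hex_digit and bidi_paired_bracket_type.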
- code_point (String) - The UTF-8 representation of the code point.
- code_point_value (Int32) - The numeric value of the code point.
- notation (String) - The Unicode notation of the code point.
- Binary Properties (UInt8) - The binary properties of the code point.
alphabetic, ascii_hex_digit, case_ignorable, ...
- Enumerated Properties (Int32) - The enumerated properties of the code point.
bidi_class, bidi_paired_bracket_type, block, ...
- String Properties (String) - The string properties of the code point. A value may be an ASCII string, a Unicode string, or a code point.
case_folding, decomposition_mapping, name, ...
Note
Mappings are somewhat special; see the ICU documentation. For example, simple_uppercase_mapping and uppercase_mapping are not exactly the same: a simple mapping always yields a single code point, while a full mapping may yield several. No language-specific mappings are implemented (e.g. in Turkish, the uppercase of i is "İ" (U+0130)).
- numeric_value (Float64) - The numeric value of the code point.
- script_extensions (Array(LowCardinality(String))) - The script extensions of the code point.
- identifier_type (Array(LowCardinality(String))) - The identifier type of the code point.
- general_category_mask (Int32) - The general category mask of the code point.
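To make a few of these columns more concrete, the following sketch computes rough analogues of them with the Python standard library. It is not a query against ClickHouse, and `describe` is a hypothetical helper. The `U+XXXX` form used for `notation` is an assumption about what that column contains. Python's `str.upper()` applies the full, language-independent uppercase mapping, which is used here only as a stand-in for uppercase_mapping. Python's Unicode data version may also differ from the one ICU uses.

```python
import unicodedata


# Hypothetical helper for illustration only; the keys mirror column names
# from this page, but the values come from Python, not from ClickHouse.
def describe(ch: str) -> dict:
    value = ord(ch)  # numeric value of the single code point
    return {
        "code_point": ch,
        "code_point_value": value,
        # Assumed notation format: "U+" plus at least four hex digits.
        "notation": f"U+{value:04X}",
        "name": unicodedata.name(ch, ""),
        # Full uppercase mapping with no language-specific tailoring.
        "uppercase_mapping": ch.upper(),
    }


for ch in ["a", "ß", "i"]:
    print(describe(ch))
```

In this sketch the full mapping of "ß" expands to the two-character string "SS", which a single-code-point simple mapping cannot express, and "i" uppercases to "I" rather than to the Turkish "İ" (U+0130).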
Example