B | |
boundaries [Utext] | boundaries b t are the positions of boundaries b in
t.
|
boundaries_mandatory [Utext] | boundaries_mandatory is like Utext.boundaries but also returns
the mandatory status of each boundary if the kind of boundary
supports that notion (the status is always true if it does not).
|
buffer_add_utf_16be [Utext] | buffer_add_utf_16be b t adds the UTF-16BE encoding of t to b.
|
buffer_add_utf_16le [Utext] | buffer_add_utf_16le b t adds the UTF-16LE encoding of t to b.
|
buffer_add_utf_8 [Utext] | buffer_add_utf_8 b t adds the UTF-8 encoding of t to b.
|
C | |
canonical_caseless_key [Utext] | canonical_caseless_key t is a key such that
equal (canonical_caseless_key t0) (canonical_caseless_key t1)
determines canonical caseless equality (TUS D145) of t0 and t1.
|
capitalized [Utext] | capitalized t is t capitalized: if the first character of t
is cased it is mapped to its title case mapping; otherwise t is
left unchanged.
|
casefolded [Utext] | casefolded t is t casefolded according to Unicode's default
case folding.
|
compare [Utext] | compare t0 t1 is the per element lexicographical order between
t0 and t1.
|
compatibility_caseless_key [Utext] | compatibility_caseless_key t is a key such that
equal (compatibility_caseless_key t0) (compatibility_caseless_key t1)
determines compatibility caseless equality (TUS D146) of t0 and t1.
|
E | |
empty [Utext] | empty is Pvec.empty, the empty Unicode text.
|
encoding_guess [Utext] | |
equal [Utext] | equal t0 t1 is true if the elements in each vector are
equal.
|
escaped [Utext] | escaped t is t except that characters whose general category is
Control, as well as U+0022 and U+005C, are escaped according to OCaml's
lexical conventions for strings:
Any U+0008 ('\b') is escaped to the sequence <U+005C, U+0062> ("\\b").
Any U+0009 ('\t') is escaped to the sequence <U+005C, U+0074> ("\\t").
Any U+000A ('\n') is escaped to the sequence <U+005C, U+006E> ("\\n").
Any U+000D ('\r') is escaped to the sequence <U+005C, U+0072> ("\\r").
Any U+0022 ('\"') is escaped to the sequence <U+005C, U+0022> ("\\\"").
Any U+005C ('\\') is escaped to the sequence <U+005C, U+005C> ("\\\\").
Any other such character is escaped by a hexadecimal "\u{H+}" escape,
with H an uppercase hexadecimal digit.
|
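The escaping rules listed for escaped can be sketched with the OCaml standard library alone. This is an illustration, not the Utext implementation: it works on a plain Uchar.t list instead of a Utext.t, the names is_control, escape_uchar and escaped_utf_8 are hypothetical, and it assumes the Control general category (Cc) covers exactly U+0000..U+001F and U+007F..U+009F.

```ocaml
(* Sketch only: operates on a Uchar.t list, not on a Utext.t value. *)

(* Assumption: general category Control (Cc) is U+0000..U+001F and
   U+007F..U+009F. *)
let is_control u =
  let c = Uchar.to_int u in
  c <= 0x1F || (0x7F <= c && c <= 0x9F)

(* Adds the escaped form of [u] to [b], encoded as UTF-8. *)
let escape_uchar b u =
  match Uchar.to_int u with
  | 0x08 -> Buffer.add_string b "\\b"
  | 0x09 -> Buffer.add_string b "\\t"
  | 0x0A -> Buffer.add_string b "\\n"
  | 0x0D -> Buffer.add_string b "\\r"
  | 0x22 -> Buffer.add_string b "\\\""
  | 0x5C -> Buffer.add_string b "\\\\"
  | c when is_control u ->
      (* Remaining control characters use an uppercase hexadecimal escape. *)
      Buffer.add_string b (Printf.sprintf "\\u{%X}" c)
  | _ -> Buffer.add_utf_8_uchar b u

(* Hypothetical helper: the UTF-8 encoding of the escaped characters. *)
let escaped_utf_8 (us : Uchar.t list) : string =
  let b = Buffer.create 16 in
  List.iter (escape_uchar b) us;
  Buffer.contents b
```

Characters outside the escaped set are copied through unchanged, so the result stays readable for ordinary text.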
I | |
identifier_caseless_key [Utext] | identifier_caseless_key t is a key such that
equal (identifier_caseless_key t0) (identifier_caseless_key t1)
determines identifier caseless equality (TUS D147) of t0 and t1.
|
init [Utext] | init ~len f is Pvec.init ~len f.
|
is_empty [Utext] | is_empty t is true if t is empty; this is equal to
Pvec.is_empty.
|
is_identifier [Utext] | |
is_normalized [Utext] | is_normalized nf t is true iff t is in normalization form nf.
|
L | |
lines [Utext] | lines ~drop_empty ~newline t breaks t into subtexts separated
by newlines determined according to newline (defaults to
`Readline).
|
lowercased [Utext] | lowercased t is t lowercased according to Unicode's default case
conversion.
|
N | |
normalized [Utext] | normalized nf t is t normalized to nf.
|
O | |
of_uchar [Utext] | of_uchar u is Pvec.singleton u.
|
of_utf_16be [Utext] | of_utf_16be ~first ~last s is like Utext.of_utf_8 but decodes
UTF-16BE.
|
of_utf_16le [Utext] | of_utf_16le ~first ~last s is like Utext.of_utf_8 but decodes
UTF-16LE.
|
of_utf_8 [Utext] | of_utf_8 ~first ~last s is the Unicode text that results from
best-effort UTF-8 decoding of the bytes of s that exist in the
range [first;last].
|
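Best-effort decoding, as described for of_utf_8, can be illustrated with the OCaml standard library (OCaml 4.14 or later). This is a sketch, not the Utext implementation: decode_utf_8 is a hypothetical name, it returns a Uchar.t list instead of a Utext.t, and it assumes that "best-effort" means malformed byte sequences are replaced by U+FFFD (Uchar.rep), which this index does not state.

```ocaml
(* Sketch only: requires OCaml >= 4.14 for String.get_utf_8_uchar. *)

(* Decodes the bytes of [s] in the range [first;last]. Assumption:
   malformed sequences decode to Uchar.rep (U+FFFD). *)
let decode_utf_8 ?(first = 0) ?last (s : string) : Uchar.t list =
  let last = match last with Some l -> l | None -> String.length s - 1 in
  if last < first then [] else
  (* Work on the sub-range so that no byte past [last] is consulted. *)
  let s = String.sub s first (last - first + 1) in
  let len = String.length s in
  let rec loop acc i =
    if i >= len then List.rev acc else
    let d = String.get_utf_8_uchar s i in
    (* On error the decode carries Uchar.rep and a strictly positive
       length, so the loop always makes progress. *)
    loop (Uchar.utf_decode_uchar d :: acc) (i + Uchar.utf_decode_length d)
  in
  loop [] 0
```

With this sketch a stray 0xFF byte in the input produces a single U+FFFD and decoding resumes at the next byte.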
P | |
paragraphs [Utext] | paragraphs ~newline t breaks t into subtexts separated either
by two consecutive newlines (determined as `NLF or
LS (U+2028)) or a single PS (U+2029).
|
pp [Utext] | pp ppf t prints the UTF-8 encoding of t, instructing ppf
to use a length of 1 for each grapheme cluster of t.
|
pp_lines [Utext] | pp_lines ppf t is like Utext.pp except only mandatory line breaks
are hinted to the formatter, see Uuseg_string.pp_utf_8_lines for
details.
|
pp_text [Utext] | pp_text ppf t is like Utext.pp except every line break is hinted
to the formatter, see Uuseg_string.pp_utf_8_text for details.
|
pp_toplevel [Utext] | pp_toplevel ppf t formats t using Utext.escaped and Utext.pp in a manner
suitable for the toplevel to represent a Utext.t value.
|
pp_toplevel_pvec [Utext] | |
pp_uchars [Utext] | pp_uchars ppf t formats t as a sequence of OCaml Uchar.t values
using only US-ASCII encoded characters according to the Unicode
notational convention for code points.
|
S | |
segment_count [Utext] | segment_count b t is Pvec.length (segments b t).
|
segments [Utext] | segments b t are the segments of text t delimited by two
boundaries of type b.
|
str [Utext] | str s is Unicode text from the valid UTF-8 encoded bytes s.
|
strf [Utext] | strf fmt ... is Format.kasprintf (fun s -> str s) fmt ... .
|
T | |
to_utf_16be [Utext] | to_utf_16be t is the UTF-16BE encoding of t.
|
to_utf_16le [Utext] | to_utf_16le t is the UTF-16LE encoding of t.
|
to_utf_8 [Utext] | to_utf_8 t is the UTF-8 encoding of t.
|
try_of_utf_16le [Utext] | try_of_utf_16le is like Utext.try_of_utf_8 but decodes UTF-16LE.
|
try_of_utf_8 [Utext] | try_of_utf_8 is like Utext.of_utf_8 except that, in case of error,
Error _ is returned as described in decode_result.
|
U | |
uncapitalized [Utext] | uncapitalized t is t uncapitalized: if the first character of
t is cased it is mapped to its lowercase mapping; otherwise t
is left unchanged.
|
unescaped [Utext] | |
unicode_version [Utext] | unicode_version is the Unicode version supported by Utext.
|
uppercased [Utext] | uppercased t is t uppercased according to Unicode's default case
conversion.
|
V | |
v [Utext] | v ~len u is Pvec.v ~len u.
|