Index of values

boundaries [Utext]
boundaries b t are the positions of boundaries b in t.
boundaries_mandatory [Utext]
boundaries_mandatory is like Utext.boundaries but returns the mandatory status of a boundary if the kind of boundary supports that notion (or always true if it does not).
buffer_add_utf_16be [Utext]
buffer_add_utf_16be b t adds the UTF-16BE encoding of t to b.
buffer_add_utf_16le [Utext]
buffer_add_utf_16le b t adds the UTF-16LE encoding of t to b.
buffer_add_utf_8 [Utext]
buffer_add_utf_8 b t adds the UTF-8 encoding of t to b.
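As a rough illustration of what the three buffer_add_* entries above compute, here is a minimal sketch that uses only the OCaml standard library (Buffer.add_utf_8_uchar and friends, available since OCaml 4.06). An array of Uchar.t stands in for Utext.t, and the helper encode and the sample text are hypothetical names introduced for the example; this is not Utext's implementation.

```ocaml
(* Sketch only: a Uchar.t array stands in for Utext.t. *)
let buffer_add_utf_8 b (t : Uchar.t array) =
  Array.iter (Buffer.add_utf_8_uchar b) t

let buffer_add_utf_16be b (t : Uchar.t array) =
  Array.iter (Buffer.add_utf_16be_uchar b) t

let buffer_add_utf_16le b (t : Uchar.t array) =
  Array.iter (Buffer.add_utf_16le_uchar b) t

(* Hypothetical helper: run one of the adders on a fresh buffer. *)
let encode add t =
  let b = Buffer.create 16 in
  add b t;
  Buffer.contents b

(* U+0041, U+00E9 and U+1F42B: one, two and four UTF-8 bytes. *)
let text = [| Uchar.of_int 0x0041; Uchar.of_int 0x00E9; Uchar.of_int 0x1F42B |]
let utf_8 = encode buffer_add_utf_8 text
let utf_16be = encode buffer_add_utf_16be text
let utf_16le = encode buffer_add_utf_16le text
```

The only difference between the two UTF-16 variants is the byte order of each 16-bit code unit; U+1F42B is emitted as a surrogate pair in both.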

canonical_caseless_key [Utext]
canonical_caseless_key t is a key such that equal (canonical_caseless_key t0) (canonical_caseless_key t1) determines canonical caseless equality (TUS D145) of t0 and t1.
capitalized [Utext]
capitalized t is t capitalized: if the first character of t is cased it is mapped to its title case mapping; otherwise t is left unchanged.
casefolded [Utext]
casefolded t is t casefolded according to Unicode's default casefold.
compare [Utext]
compare t0 t1 is the per element lexicographical order between t0 and t1.
compatibility_caseless_key [Utext]
compatibility_caseless_key t is a key such that equal (compatibility_caseless_key t0) (compatibility_caseless_key t1) determines compatibility caseless equality (TUS D146) of t0 and t1.
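The "per element lexicographical order" mentioned for compare can be pictured with the sketch below. It assumes that a text which is a strict prefix of another sorts first, which is the usual reading of lexicographic order but is not confirmed by the entry itself. A Uchar.t array stands in for Utext.t.

```ocaml
(* Sketch only: element-wise lexicographic comparison of two arrays of
   Uchar.t. Assumption: a strict prefix compares smaller. *)
let compare (t0 : Uchar.t array) (t1 : Uchar.t array) : int =
  let l0 = Array.length t0 and l1 = Array.length t1 in
  let rec loop i =
    (* One text is exhausted: the shorter one (if any) sorts first. *)
    if i = l0 || i = l1 then Int.compare l0 l1 else
    let c = Uchar.compare t0.(i) t1.(i) in
    if c <> 0 then c else loop (i + 1)
  in
  loop 0

let a = Uchar.of_char 'a'
let b = Uchar.of_char 'b'
```

Note that this orders by code point only; it is unrelated to the caseless keys listed above, which exist precisely because code point comparison ignores case and normalization.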

empty [Utext]
empty is Pvec.empty, the empty Unicode text.
encoding_guess [Utext]
encoding_guess s is the encoding guessed for s coupled with true iff there's an initial BOM.
equal [Utext]
equal t0 t1 is true if the elements in each vector are equal.
escaped [Utext]
escaped t is t except that characters whose general category is Control, U+0022 and U+005C are escaped according to OCaml's lexical conventions for strings: any U+0008 ('\b') is escaped to the sequence <U+005C, U+0062> ("\\b"); any U+0009 ('\t') to the sequence <U+005C, U+0074> ("\\t"); any U+000A ('\n') to the sequence <U+005C, U+006E> ("\\n"); any U+000D ('\r') to the sequence <U+005C, U+0072> ("\\r"); any U+0022 ('\"') to the sequence <U+005C, U+0022> ("\\\""); any U+005C ('\\') to the sequence <U+005C, U+005C> ("\\\\"); any other such character is escaped by a hexadecimal "\u{H+}" escape, with H a capital hexadecimal digit.
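The escaping rules listed for escaped can be sketched with the standard library alone. This is an illustration under stated assumptions, not Utext's code: a Uchar.t array stands in for Utext.t, the result is returned as a UTF-8 string rather than as text, and the hexadecimal escape is padded to four digits (the entry only says "H+", so the padding is a guess). The helper is_control is a hypothetical name; it relies on the fact that general category Control (Cc) is exactly U+0000..U+001F and U+007F..U+009F.

```ocaml
(* General category Control (Cc): U+0000..U+001F and U+007F..U+009F. *)
let is_control u =
  let c = Uchar.to_int u in
  c <= 0x1F || (0x7F <= c && c <= 0x9F)

(* Sketch only: escapes a Uchar.t array into a UTF-8 encoded string. *)
let escaped (t : Uchar.t array) : string =
  let b = Buffer.create 16 in
  let add u = match Uchar.to_int u with
    | 0x0008 -> Buffer.add_string b "\\b"
    | 0x0009 -> Buffer.add_string b "\\t"
    | 0x000A -> Buffer.add_string b "\\n"
    | 0x000D -> Buffer.add_string b "\\r"
    | 0x0022 -> Buffer.add_string b "\\\""
    | 0x005C -> Buffer.add_string b "\\\\"
    (* Assumption: at least four capital hexadecimal digits. *)
    | c when is_control u -> Buffer.add_string b (Printf.sprintf "\\u{%04X}" c)
    (* Everything else is kept as is. *)
    | _ -> Buffer.add_utf_8_uchar b u
  in
  Array.iter add t;
  Buffer.contents b

let sample =
  escaped (Array.map Uchar.of_int [| 0x61; 0x22; 0x0A; 0x01; 0xE9 |])
```

The named escapes are matched before the generic Control case, so U+0009 becomes "\\t" rather than a "\u{...}" escape.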

identifier_caseless_key [Utext]
identifier_caseless_key t is a key such that equal (identifier_caseless_key t0) (identifier_caseless_key t1) determines identifier caseless equality (TUS D147) of t0 and t1.
init [Utext]
init ~len f is Pvec.init ~len f.
is_empty [Utext]
is_empty t is true iff t is empty; this is Pvec.is_empty.
is_identifier [Utext]
is_identifier t is true iff t is a Default Unicode identifier; more precisely, this is UAX31-R1.
is_normalized [Utext]
is_normalized nf t is true iff t is in normalization form nf.

lines [Utext]
lines ~drop_empty ~newline t breaks t into subtexts separated by newlines determined according to newline (defaults to `Readline).
lowercased [Utext]
lowercased t is t lowercased according to Unicode's default case conversion.

normalized [Utext]
normalized nf t is t normalized to nf.

of_uchar [Utext]
of_uchar u is Pvec.singleton u.
of_utf_16be [Utext]
of_utf_16be ~first ~last s is like Utext.of_utf_8 but decodes UTF-16BE.
of_utf_16le [Utext]
of_utf_16le ~first ~last s is like Utext.of_utf_8 but decodes UTF-16LE.
of_utf_8 [Utext]
of_utf_8 ~first ~last s is the Unicode text that results from best-effort UTF-8 decoding of the bytes of s that exist in the range [first;last].
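One plausible reading of "best-effort" decoding is that malformed bytes are replaced by U+FFFD instead of raising an error. The sketch below follows that reading using String.get_utf_8_uchar from the standard library (OCaml 4.14 or later). The replacement policy, the clamping of first and last to the string bounds, and the use of a Uchar.t list in place of Utext.t are all assumptions made for the example.

```ocaml
(* Sketch only, requires OCaml >= 4.14. Decodes the bytes of [s] in the
   range [first;last]; out-of-bounds indices are clamped (assumption). *)
let of_utf_8 ?(first = 0) ?last (s : string) : Uchar.t list =
  let max_i = String.length s - 1 in
  let first = max 0 first in
  let last = match last with None -> max_i | Some l -> min l max_i in
  if first > last then [] else
  let s = String.sub s first (last - first + 1) in
  let rec loop acc i =
    if i >= String.length s then List.rev acc else
    let d = String.get_utf_8_uchar s i in
    (* On malformed input utf_decode_uchar yields Uchar.rep (U+FFFD)
       and utf_decode_length is the number of bytes to skip. *)
    loop (Uchar.utf_decode_uchar d :: acc) (i + Uchar.utf_decode_length d)
  in
  loop [] 0

let code_points t = List.map Uchar.to_int t
```

For example, code_points (of_utf_8 "A\xFFB") yields the code points of "A", U+FFFD and "B": the stray byte 0xFF is replaced and decoding continues.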

paragraphs [Utext]
paragraphs ~newline t breaks t into subtexts separated either by two consecutive newlines (determined as `NLF or LS (U+2028)) or a single PS (U+2029).
pp [Utext]
pp ppf t prints the UTF-8 encoding of t instructing the ppf to use a length of 1 for each grapheme cluster of t.
pp_lines [Utext]
pp_lines ppf t is like Utext.pp except only mandatory line breaks are hinted to the formatter, see Uuseg_string.pp_utf_8_lines for details.
pp_text [Utext]
pp_text ppf t is like Utext.pp except each line break is hinted to the formatter, see Uuseg_string.pp_utf_8_text for details.
pp_toplevel [Utext]
pp_toplevel ppf t formats t using Utext.escaped and Utext.pp in a manner suitable for the toplevel to represent a Utext.t value.
pp_toplevel_pvec [Utext]
pp_toplevel_pvec ppf ts formats ts using Utext.pp_toplevel.
pp_uchars [Utext]
pp_uchars ppf t formats t as a sequence of OCaml Uchar.t values using only US-ASCII encoded characters according to the Unicode notational convention for code points.

segment_count [Utext]
segment_count b t is Pvec.length (segments b t).
segments [Utext]
segments b t are the segments of text t delimited by two boundaries of type b.
str [Utext]
str s is Unicode text from the valid UTF-8 encoded bytes s.
strf [Utext]
strf fmt ... is Format.kasprintf (fun s -> str s) fmt ...
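The definition given for strf can be tried out with a stand-in for str. In the sketch below, str is a hypothetical replacement that decodes UTF-8 into a Uchar.t list with the standard library (OCaml 4.14 or later); only the strf line mirrors the entry above.

```ocaml
(* Hypothetical stand-in for Utext.str: decodes UTF-8 bytes into a list
   of Uchar.t (the real function returns a Utext.t). *)
let str (s : string) : Uchar.t list =
  let rec loop acc i =
    if i >= String.length s then List.rev acc else
    let d = String.get_utf_8_uchar s i in
    loop (Uchar.utf_decode_uchar d :: acc) (i + Uchar.utf_decode_length d)
  in
  loop [] 0

(* As in the index entry: format to a string, then convert with str. *)
let strf fmt = Format.kasprintf (fun s -> str s) fmt

(* "\xC3\xA9" is the UTF-8 encoding of U+00E9. *)
let t = strf "%d%s" 42 "\xC3\xA9"
```

Because strf is defined with an explicit fmt argument it stays polymorphic in the format string, so it accepts any number of arguments just like Format.asprintf.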

to_utf_16be [Utext]
to_utf_16be t is the UTF-16BE encoding of t.
to_utf_16le [Utext]
to_utf_16le t is the UTF-16LE encoding of t.
to_utf_8 [Utext]
to_utf_8 t is the UTF-8 encoding of t.
try_of_utf_16le [Utext]
try_of_utf_16le is like Utext.try_of_utf_8 but decodes UTF-16LE.
try_of_utf_8 [Utext]
try_of_utf_8 is like Utext.of_utf_8 except that, in case of error, Error _ is returned as described in decode_result.
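The difference between the try_of_* functions and their best-effort counterparts can be sketched as follows. The shape of decode_result is not given in this index, so the error payload used here (the byte index of the first malformed sequence) is purely hypothetical, as is the use of a Uchar.t list in place of Utext.t. The sketch needs OCaml 4.14 or later.

```ocaml
(* Sketch only: strict UTF-8 decoding that stops at the first error.
   Hypothetical error payload: byte index of the malformed sequence. *)
let try_of_utf_8 (s : string) : (Uchar.t list, int) result =
  let rec loop acc i =
    if i >= String.length s then Ok (List.rev acc) else
    let d = String.get_utf_8_uchar s i in
    if not (Uchar.utf_decode_is_valid d) then Error i else
    loop (Uchar.utf_decode_uchar d :: acc) (i + Uchar.utf_decode_length d)
  in
  loop [] 0

let code_points r = Result.map (List.map Uchar.to_int) r
```

Where a best-effort decoder would substitute U+FFFD and carry on, this variant reports the failure and lets the caller decide what to do.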

uncapitalized [Utext]
uncapitalized t is t uncapitalized: if the first character of t is cased it is mapped to its lowercase mapping; otherwise t is left unchanged.
unescaped [Utext]
unescaped s unescapes what Utext.escaped did and any other valid \u{H+} escape.
unicode_version [Utext]
unicode_version is the Unicode version supported by Utext.
uppercased [Utext]
uppercased t is t uppercased according to Unicode's default case conversion.

v [Utext]
v ~len u is Pvec.v ~len u.