Index of values

B
boundaries [Utext]	`boundaries b t` are the positions of boundaries `b` in `t`.
boundaries_mandatory [Utext]	`boundaries_mandatory` is like `Utext.boundaries` but also returns the mandatory status of each boundary if the kind of boundary has that notion (or always `true` if not).
buffer_add_utf_16be [Utext]	`buffer_add_utf_16be b t` adds the UTF-16BE encoding of `t` to `b`.
buffer_add_utf_16le [Utext]	`buffer_add_utf_16le b t` adds the UTF-16LE encoding of `t` to `b`.
buffer_add_utf_8 [Utext]	`buffer_add_utf_8 b t` adds the UTF-8 encoding of `t` to `b`.
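To illustrate the bytes the `buffer_add_utf_*` functions are described as producing, here is a minimal sketch that relies only on the OCaml standard library (`Buffer.add_utf_8_uchar`, `Buffer.add_utf_16be_uchar` and `Buffer.add_utf_16le_uchar`, available since OCaml 4.06). It is not Utext's implementation: the text is modelled as a plain `Uchar.t list`, and `add_utf_8`, `add_utf_16be`, `add_utf_16le` and `encode` are hypothetical names introduced for the example.

```ocaml
(* Sketch only: a "text" is modelled as a Uchar.t list, not a Utext.t. *)
let add_utf_8 b us = List.iter (Buffer.add_utf_8_uchar b) us
let add_utf_16be b us = List.iter (Buffer.add_utf_16be_uchar b) us
let add_utf_16le b us = List.iter (Buffer.add_utf_16le_uchar b) us

(* Hypothetical helper: run one of the adders above on a fresh buffer and
   return the encoded bytes. *)
let encode add us =
  let b = Buffer.create 16 in
  add b us;
  Buffer.contents b

let () =
  let e_acute = [Uchar.of_int 0x00E9] in   (* U+00E9 *)
  let emoji = [Uchar.of_int 0x1F600] in    (* U+1F600, outside the BMP *)
  assert (encode add_utf_8 e_acute = "\xC3\xA9");
  assert (encode add_utf_16be e_acute = "\x00\xE9");
  assert (encode add_utf_16le e_acute = "\xE9\x00");
  (* Outside the BMP, UTF-16 uses the surrogate pair <D83D, DE00>. *)
  assert (encode add_utf_16be emoji = "\xD8\x3D\xDE\x00")
```

The same three encodings are what `to_utf_8`, `to_utf_16be` and `to_utf_16le` are documented to return as strings.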
C
canonical_caseless_key [Utext]	`canonical_caseless_key t` is a key such that `equal (canonical_caseless_key t0) (canonical_caseless_key t1)` determines canonical caseless equality (TUS D145) of `t0` and `t1`.
capitalized [Utext]	`capitalized t` is `t` capitalized: if the first character of `t` is cased it is mapped to its title case mapping; otherwise `t` is left unchanged.
casefolded [Utext]	`casefolded t` is `t` casefolded according to Unicode's default casefold.
compare [Utext]	`compare t0 t1` is the per element lexicographical order between `t0` and `t1`.
compatibility_caseless_key [Utext]	`compatibility_caseless_key t` is a key such that `equal (compatibility_caseless_key t0) (compatibility_caseless_key t1)` determines compatibility caseless equality (TUS D146) of `t0` and `t1`.
E
empty [Utext]	`empty` is `Pvec.empty`, the empty Unicode text.
encoding_guess [Utext]	`encoding_guess s` is the encoding guessed for `s` coupled with `true` iff there's an initial BOM.
equal [Utext]	`equal t0 t1` is `true` if the elements in each vector are equal.
escaped [Utext]	`escaped t` is `t` except that characters whose general category is `Control`, U+0022 and U+005C are escaped according to OCaml's lexical conventions for strings: U+0008 (`'\b'`) is escaped to the sequence <U+005C, U+0062> (`"\\b"`); U+0009 (`'\t'`) to <U+005C, U+0074> (`"\\t"`); U+000A (`'\n'`) to <U+005C, U+006E> (`"\\n"`); U+000D (`'\r'`) to <U+005C, U+0072> (`"\\r"`); U+0022 (`'\"'`) to <U+005C, U+0022> (`"\\\""`); U+005C (`'\\'`) to <U+005C, U+005C> (`"\\\\"`); any other such character is escaped by a hexadecimal `"\u{H+}"` escape with `H` an uppercase hexadecimal digit.
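The escaping rules listed for `escaped` can be sketched with the standard library alone (OCaml 4.14 or later for `String.get_utf_8_uchar`). This is only an illustration under assumptions, not Utext's code: it works on a UTF-8 encoded `string` rather than a `Utext.t`, `escape_utf_8` and `is_control` are hypothetical names, the `Control` category is taken to be U+0000..U+001F and U+007F..U+009F, and the zero-padding of the `\u{H+}` escape to four digits is a guess since the description only requires uppercase hexadecimal digits.

```ocaml
(* Assumption: general category Control (Cc) is U+0000..U+001F and
   U+007F..U+009F. *)
let is_control u =
  let c = Uchar.to_int u in
  c <= 0x1F || (0x7F <= c && c <= 0x9F)

(* Hypothetical helper, not part of Utext: applies the escaping rules to a
   UTF-8 encoded string. Malformed bytes decode to U+FFFD. *)
let escape_utf_8 s =
  let b = Buffer.create (String.length s) in
  let rec loop i =
    if i >= String.length s then Buffer.contents b else
    let d = String.get_utf_8_uchar s i in
    let u = Uchar.utf_decode_uchar d in
    begin match Uchar.to_int u with
    | 0x08 -> Buffer.add_string b "\\b"
    | 0x09 -> Buffer.add_string b "\\t"
    | 0x0A -> Buffer.add_string b "\\n"
    | 0x0D -> Buffer.add_string b "\\r"
    | 0x22 -> Buffer.add_string b "\\\""
    | 0x5C -> Buffer.add_string b "\\\\"
    | c when is_control u ->
        (* Four-digit padding is an assumption of this sketch. *)
        Buffer.add_string b (Printf.sprintf "\\u{%04X}" c)
    | _ -> Buffer.add_utf_8_uchar b u
    end;
    loop (i + Uchar.utf_decode_length d)
  in
  loop 0

let () =
  assert (escape_utf_8 "a\tb" = "a\\tb");
  assert (escape_utf_8 "say \"hi\"" = "say \\\"hi\\\"");
  assert (escape_utf_8 "\x01" = "\\u{0001}")
```

Characters outside the escaped set, including non-ASCII ones, pass through unchanged in this sketch.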
I
identifier_caseless_key [Utext]	`identifier_caseless_key t` is a key such that `equal (identifier_caseless_key t0) (identifier_caseless_key t1)` determines identifier caseless equality (TUS D147) of `t0` and `t1`.
init [Utext]	`init ~len f` is `Pvec.init ~len f`.
is_empty [Utext]	`is_empty t` is `true` if `t` is empty; this is equal to `Pvec.is_empty`.
is_identifier [Utext]	`is_identifier t` is `true` iff `t` is a Default Unicode identifier, more precisely this is UAX31-R1.
is_normalized [Utext]	`is_normalized nf t` is `true` iff `t` is in normalization form `nf`.
L
lines [Utext]	`lines ~drop_empty ~newline t` breaks `t` into subtexts separated by newlines determined according to `newline` (defaults to `` `Readline ``).
lowercased [Utext]	`lowercased t` is `t` lowercased according to Unicode's default case conversion.
N
normalized [Utext]	`normalized nf t` is `t` normalized to `nf`.
O
of_uchar [Utext]	`of_uchar u` is `Pvec.singleton u`.
of_utf_16be [Utext]	`of_utf_16be ~first ~last s` is like `Utext.of_utf_8` but decodes UTF-16BE.
of_utf_16le [Utext]	`of_utf_16le ~first ~last s` is like `Utext.of_utf_8` but decodes UTF-16LE.
of_utf_8 [Utext]	`of_utf_8 ~first ~last s` is the Unicode text that results from best-effort UTF-8 decoding of the bytes of `s` that exist in the range [`first`;`last`].
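One way to read "best-effort decoding of the bytes that exist in the range" is sketched below with the standard library (OCaml 4.14 or later). The details are assumptions, not documented Utext behaviour: the range is clamped to the bytes of `s` that actually exist, each malformed byte sequence becomes U+FFFD (`Uchar.rep`), the result is a `Uchar.t list` rather than a `Utext.t`, and `uchars_of_utf_8` is a hypothetical name.

```ocaml
(* Hypothetical stand-in for a best-effort decoder; not Utext's code. *)
let uchars_of_utf_8 ?(first = 0) ?last s =
  let max_idx = String.length s - 1 in
  let first = max 0 first in
  let last = match last with None -> max_idx | Some l -> min l max_idx in
  if first > last then [] else
  (* Only the bytes in [first;last] take part in decoding. *)
  let sub = String.sub s first (last - first + 1) in
  let rec loop acc i =
    if i >= String.length sub then List.rev acc else
    let d = String.get_utf_8_uchar sub i in
    (* On malformed input utf_decode_uchar yields Uchar.rep (U+FFFD). *)
    loop (Uchar.utf_decode_uchar d :: acc) (i + Uchar.utf_decode_length d)
  in
  loop [] 0

let () =
  let u = Uchar.of_int in
  assert (uchars_of_utf_8 "a\xFFb" = [u 0x61; Uchar.rep; u 0x62]);
  assert (uchars_of_utf_8 ~first:1 ~last:1 "abc" = [u 0x62]);
  (* A range that cuts a multi-byte sequence decodes to U+FFFD here. *)
  assert (uchars_of_utf_8 ~last:0 "\xC3\xA9" = [Uchar.rep])
```

A `try_of_*` style variant would instead stop at the first malformed sequence and report an error.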
P
paragraphs [Utext]	`paragraphs ~newline t` breaks `t` into subtexts separated either by two consecutive newlines (determined as `` `NLF `` or LS (U+2028)) or a single PS (U+2029).
pp [Utext]	`pp ppf t` prints the UTF-8 encoding of `t` instructing the `ppf` to use a length of `1` for each grapheme cluster of `t`.
pp_lines [Utext]	`pp_lines ppf t` is like `Utext.pp` except only mandatory line breaks are hinted to the formatter, see `Uuseg_string.pp_utf_8_lines` for details.
pp_text [Utext]	`pp_text ppf t` is like `Utext.pp` except each line break is hinted to the formatter, see `Uuseg_string.pp_utf_8_text` for details.
pp_toplevel [Utext]	`pp_toplevel ppf t` formats `t` using `Utext.escaped` and `Utext.pp` in a manner suitable for the toplevel to represent a `Utext.t` value.
pp_toplevel_pvec [Utext]	`pp_toplevel_pvec ppf ts` formats `ts` using `Utext.pp_toplevel`.
pp_uchars [Utext]	`pp_uchars ppf t` formats `t` as a sequence of OCaml `Uchar.t` values using only US-ASCII encoded characters according to the Unicode notational convention for code points.
S
segment_count [Utext]	`segment_count b t` is `Pvec.length (segments b t)`.
segments [Utext]	`segments b t` are the segments of text `t` delimited by two boundaries of type `b`.
str [Utext]	`str s` is Unicode text from the valid UTF-8 encoded bytes `s`.
strf [Utext]	`strf fmt ...` is `Format.kasprintf (fun s -> str s) fmt ...`.
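The `Format.kasprintf` pattern behind `strf` can be sketched without Utext as follows (OCaml 4.14 or later). The `str` below is only a stand-in: it decodes valid UTF-8 into a `Uchar.t list` and, as an assumption of this sketch, raises `Invalid_argument` on malformed input. The real `str` returns a `Utext.t`.

```ocaml
(* Stand-in for Utext.str (assumption): decode valid UTF-8 into a
   Uchar.t list, raising Invalid_argument on malformed input. *)
let str s =
  let rec loop acc i =
    if i >= String.length s then List.rev acc else
    let d = String.get_utf_8_uchar s i in
    if not (Uchar.utf_decode_is_valid d)
    then invalid_arg "str: invalid UTF-8"
    else loop (Uchar.utf_decode_uchar d :: acc) (i + Uchar.utf_decode_length d)
  in
  loop [] 0

(* kasprintf formats to a string, then hands it to the continuation. *)
let strf fmt = Format.kasprintf (fun s -> str s) fmt

let () =
  (* "\xC3\xA9" is the UTF-8 encoding of U+00E9. *)
  assert (strf "%d-%s" 42 "\xC3\xA9" = str "42-\xC3\xA9");
  assert (List.length (strf "%s" "\xC3\xA9") = 1)
```

Writing `strf` with an explicit `fmt` parameter keeps it polymorphic in the format's argument types.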
T
to_utf_16be [Utext]	`to_utf_16be t` is the UTF-16BE encoding of `t`.
to_utf_16le [Utext]	`to_utf_16le t` is the UTF-16LE encoding of `t`.
to_utf_8 [Utext]	`to_utf_8 t` is the UTF-8 encoding of `t`.
try_of_utf_16le [Utext]	`try_of_utf_16le` is like `Utext.try_of_utf_8` but decodes UTF-16LE.
try_of_utf_8 [Utext]	`try_of_utf_8` is like `Utext.of_utf_8` except that in case of error `Error _` is returned, as described in `decode_result`.
U
uncapitalized [Utext]	`uncapitalized t` is `t` uncapitalized: if the first character of `t` is cased it is mapped to its lowercase mapping; otherwise `t` is left unchanged.
unescaped [Utext]	`unescaped s` unescapes what `Utext.escaped` did and any other valid `\u{H+}` escape.
unicode_version [Utext]	`unicode_version` is the Unicode version supported by `Utext`.
uppercased [Utext]	`uppercased t` is `t` uppercased according to Unicode's default case conversion.
V
v [Utext]	`v ~len u` is `Pvec.v ~len u`.