Jstr
JavaScript strings
val v : string -> t
v s
is the UTF-8 encoded OCaml string s
as a JavaScript string.
val length : t -> int
length s
is the length of s
.
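Assuming length mirrors JavaScript's own length property, it counts 16-bit UTF-16 code units, which is why it can differ from the number of Unicode characters. The sketch below models that count on a UTF-8 encoded OCaml string using only the standard library (OCaml 4.14 or later). The helper utf16_length is hypothetical and not part of Jstr.

```ocaml
(* Hypothetical helper: number of UTF-16 code units needed to encode the
   UTF-8 string [s]. Invalid UTF-8 bytes decode to Uchar.rep (one unit). *)
let utf16_length (s : string) : int =
  let rec loop i acc =
    if i >= String.length s then acc else
    let d = String.get_utf_8_uchar s i in
    let u = Uchar.utf_decode_uchar d in
    (* Uchar.utf_16_byte_length is 2 or 4 bytes, that is 1 or 2 code units *)
    loop (i + Uchar.utf_decode_length d) (acc + Uchar.utf_16_byte_length u / 2)
  in
  loop 0 0

let () =
  assert (utf16_length "abc" = 3);
  assert (utf16_length "\u{00E9}" = 1);   (* one character, one code unit *)
  assert (utf16_length "\u{1F600}" = 2)   (* one character, two code units *)
```

Under that assumption, a string holding a single character outside the Basic Multilingual Plane has length 2.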
val get : t -> int -> Stdlib.Uchar.t
get s i
is the Unicode character at position i
in s
. If this happens to be a lone low surrogate or any high surrogate, Uchar.rep
is returned. Raises Invalid_argument
if i
is out of bounds.
val empty : t
empty
is an empty string.
val sp : t
sp
is Jstr.v " "
.
val nl : t
nl
is Jstr.v "\n"
.
concat ?sep ss
is the concatenation of the strings in the list ss,
inserting sep
between each of them (defaults to empty
).
pad_start ~pad n s
is s
with pad
strings prepended to s
until the length of the result is n
or s
if length s >= n
. The first prepended pad
may be truncated to satisfy the constraint. pad
defaults to sp
.
Warning. Since length
is neither the number of Unicode characters of s
nor its number of grapheme clusters, if you are using this for visual layout, it will fail in many cases. At least consider normalizing s
to `NFC
before.
pad_end ~pad n s
is s
with pad
strings appended to s
until the length
of the result is n
or s
if length s >= n
. The last appended pad
may be truncated to satisfy the constraint. pad
defaults to sp
.
Warning. Since length
is neither the number of Unicode characters of s
nor its number of grapheme clusters, if you are using this for visual layout, it will fail in many cases. At least consider normalizing s
to `NFC
before.
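The truncation rule of pad_start and pad_end is easier to see on a concrete case. The following is a stdlib-only model on plain OCaml strings (ASCII only, lengths in bytes), not Brr's implementation. The helper fill is hypothetical, and the choice of which pad copy gets cut follows JavaScript's padStart and padEnd.

```ocaml
(* [fill pad k] is [pad] repeated and cut down to exactly [k] bytes
   (empty if [k <= 0] or [pad] is empty). Hypothetical helper. *)
let fill pad k =
  if k <= 0 || pad = "" then "" else
  let b = Buffer.create k in
  while Buffer.length b < k do Buffer.add_string b pad done;
  Buffer.sub b 0 k

(* Model of the documented rule; [pad] defaults to a single space. *)
let pad_start ?(pad = " ") n s = fill pad (n - String.length s) ^ s
let pad_end ?(pad = " ") n s = s ^ fill pad (n - String.length s)

let () =
  assert (pad_start ~pad:"0" 5 "42" = "00042");
  (* The pad copy adjacent to [s] is the one that is cut short. *)
  assert (pad_start ~pad:"12" 6 "abc" = "121abc");
  assert (pad_end ~pad:"12" 6 "abc" = "abc121");
  (* Already long enough: the string is returned unchanged. *)
  assert (pad_start 2 "abc" = "abc")
```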
find_sub ~start ~sub s
is the start index (if any) of the first occurrence of sub
in s
at or after start
.
find_last_sub ~before ~sub s
is the start index (if any) of the last occurrence of sub
in s
before before
(defaults to length
s
).
slice ~start ~stop s
is the string s.start
, s.start+1
, ... s.stop - 1
. start
defaults to 0
and stop
to length s
.
If start
or stop
is negative, its absolute value is subtracted from length s
. This means that -1
denotes the last character of the string.
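As an illustration of the index rules, here is a stdlib-only model of slice on a plain OCaml string (ASCII only, byte indices). It is a sketch, not Brr's code. Clamping out-of-range indices and returning the empty string when stop is not after start are assumptions taken from JavaScript's slice method.

```ocaml
(* Model of [slice]: negative indices count from the end of the string. *)
let slice ?(start = 0) ?stop s =
  let len = String.length s in
  let stop = Option.value stop ~default:len in
  (* Assumption: indices are clamped to [0; len], as JavaScript's slice does. *)
  let norm i = if i < 0 then max 0 (len + i) else min i len in
  let start = norm start and stop = norm stop in
  if stop <= start then "" else String.sub s start (stop - start)

let () =
  assert (slice ~start:1 ~stop:3 "hello" = "el");   (* s.1, s.2 *)
  assert (slice ~start:(-3) "hello" = "llo");       (* last three characters *)
  assert (slice ~start:1 ~stop:(-1) "hello" = "ell");
  assert (slice "hello" = "hello")
```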
sub ~start ~len s
is the string s.start
, ... s.start + len - 1
. start
defaults to 0
and len
to length s - start
.
If start
is negative, its absolute value is subtracted from length s
. This means that -1
denotes the last character of the string. If len
is negative it is treated as 0
.
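A comparable stdlib-only model of sub on a plain OCaml string (ASCII only) may help contrast it with slice: the second bound is a length rather than a stop index. Clamping a len that runs past the end of the string is an assumption not stated in the text above.

```ocaml
(* Model of [sub]: a start index (possibly negative) and a length. *)
let sub ?(start = 0) ?len s =
  let n = String.length s in
  let start = if start < 0 then max 0 (n + start) else min start n in
  let len = match len with
    | None -> n - start
    (* Negative lengths are treated as 0. Lengths running past the end are
       clamped, which is an assumption of this sketch. *)
    | Some l -> max 0 (min l (n - start))
  in
  String.sub s start len

let () =
  assert (sub ~start:1 ~len:3 "hello" = "ell");
  assert (sub ~start:(-2) "hello" = "lo");
  assert (sub ~len:(-1) "hello" = "")
```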
cuts sep s
is the list of all (possibly empty) substrings of s
that are delimited by matches of the non-empty separator string sep
.
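The phrase "possibly empty" matters when separators are adjacent or sit at either end of the string. The sketch below models the behaviour on plain OCaml strings with a naive search. It is not Brr's implementation, and raising Invalid_argument on an empty separator is a choice made for this sketch only.

```ocaml
(* Model of [cuts]: split [s] at every match of the non-empty string [sep]. *)
let cuts ~sep s =
  let m = String.length sep in
  if m = 0 then invalid_arg "cuts: empty separator" (* sketch-only choice *) else
  let n = String.length s in
  let rec loop start i acc =
    if i + m > n then
      (* No room left for a match: the remainder is the last piece. *)
      List.rev (String.sub s start (n - start) :: acc)
    else if String.sub s i m = sep then
      loop (i + m) (i + m) (String.sub s start (i - start) :: acc)
    else loop start (i + 1) acc
  in
  loop 0 0 []

let () =
  assert (cuts ~sep:"," "a,,b" = ["a"; ""; "b"]);   (* empty middle piece *)
  assert (cuts ~sep:"," ",a," = [""; "a"; ""]);     (* empty end pieces *)
  assert (cuts ~sep:"--" "a--b-c" = ["a"; "b-c"])
```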
val fold_uchars : (Stdlib.Uchar.t -> 'a -> 'a) -> t -> 'a -> 'a
fold_uchars f s acc
folds f
over the Unicode characters of s
starting with acc
. Decoding errors (that is, unpaired UTF-16 surrogates) are reported as Uchar.rep
.
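To show the argument order and the error convention, here is a stdlib-only analogue that folds over the Unicode characters of a UTF-8 encoded OCaml string (OCaml 4.14 or later). It is a model, not Brr's code: here the decoding errors are invalid UTF-8 bytes rather than unpaired UTF-16 surrogates, and uchar_count is a hypothetical helper.

```ocaml
(* Same shape as the signature above: function, then string, then accumulator. *)
let fold_uchars (f : Uchar.t -> 'a -> 'a) (s : string) (acc : 'a) : 'a =
  let rec loop i acc =
    if i >= String.length s then acc else
    let d = String.get_utf_8_uchar s i in
    (* On a decoding error Uchar.utf_decode_uchar yields Uchar.rep. *)
    loop (i + Uchar.utf_decode_length d) (f (Uchar.utf_decode_uchar d) acc)
  in
  loop 0 acc

(* Hypothetical helper: number of Unicode characters of [s]. *)
let uchar_count s = fold_uchars (fun _ n -> n + 1) s 0

let () =
  assert (uchar_count "a\u{1F600}" = 2);
  assert (fold_uchars (fun u acc -> Uchar.to_int u :: acc) "ab" [] = [0x62; 0x61]);
  assert (fold_uchars (fun u acc -> u :: acc) "\xFF" [] = [Uchar.rep])
```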
fold_jstr_uchars
is like fold_uchars
but the characters are given as strings.
For more information on normalization consult a short introduction, the UAX #15 Unicode Normalization Forms and normalization charts.
The type for normalization forms.
`NFD
normalization form D, canonical decomposition.
`NFC
normalization form C, canonical decomposition followed by canonical composition.
`NFKD
normalization form KD, compatibility decomposition.
`NFKC
normalization form KC, compatibility decomposition followed by canonical composition.
val normalized : normalization -> t -> t
normalized nf t
is t
normalized to nf
.
For more information about case see the Unicode case mapping FAQ and the case mapping charts. Note that these algorithms are insensitive to language and context and may produce sub-par results for some users.
val is_empty : t -> bool
is_empty s
is true
iff s
is an empty string.
starts_with ~prefix s
is true
iff s
starts with prefix
(as per equal
).
ends_with ~suffix s
is true
iff s
ends with suffix
(as per equal
).
equal s0 s1
is true
iff s0
and s1
are equal. Warning. Unless s0
and s1
are known to be in a particular normal form the test is textually meaningless.
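The warning can be made concrete with two encodings of the same visible text. The sketch uses plain OCaml strings and String.equal as a stand-in for a code-unit-wise comparison. The standard library has no normalization function, so the remedy (normalizing both sides to the same form before comparing) is only described in the comment.

```ocaml
let composed = "\u{00E9}"      (* U+00E9, e with acute accent *)
let decomposed = "e\u{0301}"   (* U+0065 followed by U+0301, combining acute *)

let () =
  (* Both render as the same text, yet they are different sequences, so a
     plain equality test reports them as different. Normalizing both to the
     same form (for example NFC) beforehand would make them equal. *)
  assert (not (String.equal composed decomposed));
  assert (String.length composed = 2 && String.length decomposed = 3)
```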
compare s0 s1
is a total order on strings compatible with equal
. Warning. The comparison is textually meaningless.
val of_uchar : Stdlib.Uchar.t -> t
of_uchar u
is a string made of u
.
val of_char : char -> t
of_char c
is a string made of c
.
val to_string : t -> string
to_string s
is s
as a UTF-8 encoded OCaml string.
val of_string : string -> t
of_string s
is the UTF-8 encoded OCaml string s
as a JavaScript string.
val binary_to_octets : t -> string
binary_to_octets s
is the JavaScript binary string s
as an OCaml string of bytes. In s
each 16-bit JavaScript character encodes a byte.
val binary_of_octets : string -> t
binary_of_octets s
is the OCaml string of bytes s
as a JavaScript binary string in which each 16-bit character encodes a byte.
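A small model may clarify what a binary string is: each 16-bit code unit carries one byte value in the range 0 to 255, with no UTF-8 decoding involved. In this stdlib-only sketch a JavaScript string is represented as an int array of code units. Rejecting code units above 0xFF is an assumption of the sketch, since the text does not say how such input is handled.

```ocaml
(* Model: one 16-bit code unit per byte of the OCaml string. *)
let binary_of_octets (s : string) : int array =
  Array.init (String.length s) (fun i -> Char.code s.[i])

(* Model: one byte per code unit. Units above 0xFF are rejected here,
   which is an assumption of this sketch. *)
let binary_to_octets (units : int array) : string =
  String.init (Array.length units) (fun i ->
    let u = units.(i) in
    if u > 0xFF then invalid_arg "not a binary string" else Char.chr u)

let () =
  let bytes = "\x00\xE9\xFF" in   (* arbitrary bytes, not valid UTF-8 *)
  let units = binary_of_octets bytes in
  assert (units = [| 0x00; 0xE9; 0xFF |]);
  assert (binary_to_octets units = bytes)
```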
val to_int : ?base:int -> t -> int option
to_int s
is the integer resulting from parsing s
as a number in base base
(guessed by default). The function uses Number.parseInt
and maps Float.nan
results to None
.
val of_int : ?base:int -> int -> t
of_int ~base i
formats i
as a number in base base
(defaults to 10
). Conversion is performed via Number.toString
.
val to_float : t -> float
to_float s
is the floating point number resulting from parsing s
. This always succeeds and returns Float.nan
on unparseable inputs. The function uses Number.parseFloat
.
val of_float : ?frac:int -> float -> t
of_float ~frac n
formats n
with frac
fixed fractional digits (or as needed if unspecified). This function uses Number.toFixed
if frac
is specified and Number.toString
otherwise.