Module Jstr

JavaScript strings

type t

The type for JavaScript UTF-16 encoded strings.

val v : string -> t

v s is the UTF-8 encoded OCaml string s as a JavaScript string.

val length : t -> int

length s is the length of s.

val get : t -> int -> Stdlib.Uchar.t

get s i is the Unicode character at position i in s. If this happens to be a lone low or any high surrogate, Uchar.rep is returned. Raises Invalid_argument if i is out of bounds.

val get_jstr : t -> int -> t

get_jstr s i is like get but with the character as a string.
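The following sketch illustrates that length and the positions taken by get count UTF-16 code units rather than Unicode characters. It assumes the code is built against the brr library and run as JavaScript via js_of_ocaml; the names s, n, first and first_jstr are only illustrative.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)

(* "a" followed by U+1F42B, which lies outside the Basic Multilingual Plane. *)
let s = Jstr.v "a\u{1F42B}"

(* U+1F42B needs a surrogate pair, so the string has 1 + 2 code units. *)
let n = Jstr.length s

(* Position 0 holds U+0061, as a Uchar.t or as a one-character string. *)
let first = Jstr.get s 0
let first_jstr = Jstr.get_jstr s 0

let () =
  assert (n = 3);
  assert (Uchar.to_int first = 0x61);
  assert (Jstr.equal first_jstr (Jstr.v "a"))
```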

Constants

val empty : t

empty is an empty string.

val sp : t

sp is Jstr.v " ".

val nl : t

nl is Jstr.v "\n".

Assembling

val append : t -> t -> t

append s0 s1 appends s1 to s0.

val (+) : t -> t -> t

s0 + s1 is append s0 s1.

val concat : ?sep:t -> t list -> t

concat ?sep ss concatenates the list of strings ss, inserting sep between each of them (sep defaults to empty).
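As a small sketch, the three assembling functions can build the same string. This assumes a program compiled with brr and js_of_ocaml; hello, world, s0, s1 and s2 are illustrative names.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)
let hello = Jstr.v "hello"
let world = Jstr.v "world"

(* Explicit calls to append. *)
let s0 = Jstr.append hello (Jstr.append Jstr.sp world)

(* The (+) operator, brought into scope by a local open of Jstr. *)
let s1 = Jstr.(hello + sp + world)

(* concat with a separator inserted between the list elements. *)
let s2 = Jstr.concat ~sep:Jstr.sp [hello; world]

let () =
  assert (Jstr.equal s0 s1);
  assert (Jstr.equal s1 s2);
  assert (Jstr.to_string s2 = "hello world")
```

Note that the local open Jstr.( ... ) shadows the integer (+) inside the parentheses, so it is best kept to expressions that only involve Jstr.t values.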

val pad_start : ?pad:t -> int -> t -> t

pad_start ~pad n s is s with pad strings prepended to s until the length of the result is n, or s itself if length s >= n. The first prepended pad may be truncated to satisfy the constraint. pad defaults to sp.

Warning. Since length is neither the number of Unicode characters of s nor its number of grapheme clusters, if you are using this for visual layout, it will fail in many cases. At least consider normalizing s to `NFC before.

val pad_end : ?pad:t -> int -> t -> t

pad_end ~pad n s is s with pad strings appended to s until the length of the result is n, or s itself if length s >= n. The last appended pad may be truncated to satisfy the constraint. pad defaults to sp.

Warning. Since length is neither the number of Unicode characters of s nor its number of grapheme clusters, if you are using this for visual layout, it will fail in many cases. At least consider normalizing s to `NFC before.
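A brief sketch of both padding functions, including the case the warning describes. It assumes brr and js_of_ocaml; the names zero, padded_start, padded_end, unchanged and accented are illustrative.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)
let zero = Jstr.v "0"

(* Left-pad "42" with zeros up to a length of 5. *)
let padded_start = Jstr.pad_start ~pad:zero 5 (Jstr.v "42")

(* Right-pad "ab" with the default pad, a single space. *)
let padded_end = Jstr.pad_end 5 (Jstr.v "ab")

(* A string that is already long enough is returned as is. *)
let unchanged = Jstr.pad_start ~pad:zero 2 (Jstr.v "1234")

(* "e" followed by U+0301 renders as one glyph but has length 2, so
   padding to 3 adds a single space rather than two. *)
let accented = Jstr.pad_start 3 (Jstr.v "e\u{0301}")

let () =
  assert (Jstr.to_string padded_start = "00042");
  assert (Jstr.to_string padded_end = "ab   ");
  assert (Jstr.to_string unchanged = "1234");
  assert (Jstr.length accented = 3)
```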

val repeat : int -> t -> t

repeat n s is s repeated n times. Raises Jv.Error if n is negative.

Finding

val find_sub : ?start:int -> sub:t -> t -> int option

find_sub ~start ~sub s is the start index (if any) of the first occurrence of sub in s at or after start.

val find_last_sub : ?before:int -> sub:t -> t -> int option

find_last_sub ~before ~sub s is the start index (if any) of the last occurrence of sub in s before before (defaults to length s).
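The search functions might be used as follows. This is a sketch assuming brr and js_of_ocaml; s, comma and the result names are illustrative.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)
let s = Jstr.v "a,b,c"
let comma = Jstr.v ","

(* First comma in the whole string. *)
let first = Jstr.find_sub ~sub:comma s

(* First comma at or after index 2. *)
let from_two = Jstr.find_sub ~start:2 ~sub:comma s

(* Last comma in the whole string. *)
let last = Jstr.find_last_sub ~sub:comma s

(* A substring that does not occur yields None. *)
let missing = Jstr.find_sub ~sub:(Jstr.v "z") s

let () =
  assert (first = Some 1);
  assert (from_two = Some 3);
  assert (last = Some 3);
  assert (missing = None)
```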

Breaking

val slice : ?start:int -> ?stop:int -> t -> t

slice ~start ~stop s is the string s.start, s.start+1, ... s.stop - 1. start defaults to 0 and stop to length s.

If start or stop are negative they are subtracted from length s. This means that -1 denotes the last character of the string.

val sub : ?start:int -> ?len:int -> t -> t

sub ~start ~len s is the string s.start, ... s.start + len - 1. start defaults to 0 and len to length s - start.

If start is negative it is subtracted from length s. This means that -1 denotes the last character of the string. If len is negative it is treated as 0.
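To contrast the two functions: slice takes a stop index, while sub takes a length. The sketch below assumes brr and js_of_ocaml, and its binding names are illustrative.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)
let s = Jstr.v "abcdef"

(* Indices 1, 2 and 3; the stop index 4 is excluded. *)
let by_slice = Jstr.slice ~start:1 ~stop:4 s

(* A negative start counts from the end of the string. *)
let last_two = Jstr.slice ~start:(-2) s

(* The same three characters, given as a start and a length. *)
let by_sub = Jstr.sub ~start:1 ~len:3 s

(* -1 denotes the last character. *)
let last_one = Jstr.sub ~start:(-1) s

let () =
  assert (Jstr.to_string by_slice = "bcd");
  assert (Jstr.to_string last_two = "ef");
  assert (Jstr.to_string by_sub = "bcd");
  assert (Jstr.to_string last_one = "f")
```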

val cuts : sep:t -> t -> t list

cuts ~sep s is the list of all (possibly empty) substrings of s that are delimited by matches of the non-empty separator string sep.
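For instance, two adjacent separators produce an empty substring between them. A minimal sketch, assuming brr and js_of_ocaml; parts is an illustrative name.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)

(* Split on commas; the two adjacent commas delimit an empty string. *)
let parts = Jstr.cuts ~sep:(Jstr.v ",") (Jstr.v "a,,b")

let () = assert (List.map Jstr.to_string parts = ["a"; ""; "b"])
```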

Traversing and transforming

val fold_uchars : (Stdlib.Uchar.t -> 'a -> 'a) -> t -> 'a -> 'a

fold_uchars f s acc folds f over the Unicode characters of s starting with acc. Decoding errors (that is unpaired UTF-16 surrogates) are reported as Uchar.rep.

val fold_jstr_uchars : (t -> 'a -> 'a) -> t -> 'a -> 'a

fold_jstr_uchars is like fold_uchars but the characters are given as strings.
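One possible use of fold_uchars is counting Unicode characters, which differs from length whenever the string contains surrogate pairs. The helper uchar_count below is hypothetical and not part of the module; the sketch assumes brr and js_of_ocaml.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)

(* Hypothetical helper: counts decoded Unicode characters. The string
   comes before the initial accumulator, as in the signature. *)
let uchar_count s = Jstr.fold_uchars (fun _ n -> n + 1) s 0

(* "a" followed by U+1F42B, which is encoded as a surrogate pair. *)
let s = Jstr.v "a\u{1F42B}"

let () =
  assert (Jstr.length s = 3);   (* UTF-16 code units *)
  assert (uchar_count s = 2)    (* Unicode characters *)
```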

val trim : t -> t

trim s is s without whitespace from the beginning and end of the string.

Normalization

For more information on normalization consult a short introduction, the UAX #15 Unicode Normalization Forms and normalization charts.

type normalization = [
  | `NFD
  | `NFC
  | `NFKD
  | `NFKC
]

The type for normalization forms.

val normalized : normalization -> t -> t

normalized nf t is t normalized to nf.
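The sketch below shows two encodings of the same accented letter that only compare equal after normalization. It assumes brr and js_of_ocaml; composed and decomposed are illustrative names.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)

(* U+00E9, the precomposed form of e with an acute accent. *)
let composed = Jstr.v "\u{00E9}"

(* "e" followed by U+0301, the combining acute accent. *)
let decomposed = Jstr.v "e\u{0301}"

let () =
  (* The code unit sequences differ, so the strings are not equal. *)
  assert (not (Jstr.equal composed decomposed));
  (* Normalizing either one to the other's form makes them equal. *)
  assert (Jstr.equal (Jstr.normalized `NFC decomposed) composed);
  assert (Jstr.equal (Jstr.normalized `NFD composed) decomposed)
```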

Case mapping

For more information about case see the Unicode case mapping FAQ and the case mapping charts. Note that these algorithms are insensitive to language and context and may produce sub-par results for some users.

val lowercased : t -> t

lowercased s is s lowercased according to Unicode's default case conversion.

val uppercased : t -> t

uppercased s is s uppercased according to Unicode's default case conversion.
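A short sketch of both case mappings, assuming brr and js_of_ocaml. The last binding relies on the default Unicode mapping of U+00DF to "SS", which shows that case conversion can change the length of a string.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)
let lower = Jstr.lowercased (Jstr.v "Hello")
let upper = Jstr.uppercased (Jstr.v "Hello")

(* "stra", U+00DF (sharp s), "e": uppercasing maps U+00DF to "SS". *)
let sharp_s = Jstr.uppercased (Jstr.v "stra\u{00DF}e")

let () =
  assert (Jstr.to_string lower = "hello");
  assert (Jstr.to_string upper = "HELLO");
  assert (Jstr.to_string sharp_s = "STRASSE")
```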

Predicates and comparisons

val is_empty : t -> bool

is_empty s is true iff s is an empty string.

val starts_with : prefix:t -> t -> bool

starts_with ~prefix s is true iff s starts with prefix (as per equal).

val includes : affix:t -> t -> bool

includes ~affix s is true iff s includes affix (as per equal).

val ends_with : suffix:t -> t -> bool

ends_with ~suffix s is true iff s ends with suffix (as per equal).

val equal : t -> t -> bool

equal s0 s1 is true iff s0 and s1 are equal. Warning. Unless s0 and s1 are known to be in a particular normal form the test is textually meaningless.

val compare : t -> t -> int

compare s0 s1 is a total order on strings compatible with equal. Warning. The comparison is textually meaningless.
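The predicates and comparisons could be exercised as follows. This is a sketch assuming brr and js_of_ocaml; s is an illustrative name.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)
let s = Jstr.v "hello world"

let () =
  assert (Jstr.starts_with ~prefix:(Jstr.v "hello") s);
  assert (Jstr.includes ~affix:(Jstr.v "lo w") s);
  assert (Jstr.ends_with ~suffix:(Jstr.v "world") s);
  assert (not (Jstr.is_empty s));
  assert (Jstr.is_empty Jstr.empty);
  (* compare is compatible with equal: equal strings compare to 0. *)
  assert (Jstr.compare s (Jstr.v "hello world") = 0)
```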

Conversions

val of_uchar : Stdlib.Uchar.t -> t

of_uchar u is a string made of u.

val of_char : char -> t

of_char c is a string made of c.

val to_string : t -> string

to_string s is s as a UTF-8 encoded OCaml string.

val of_string : string -> t

of_string s is the UTF-8 encoded OCaml string s as a JavaScript string.

val binary_to_octets : t -> string

binary_to_octets s is the JavaScript binary string s as an OCaml string of bytes. In s each 16-bit JavaScript character encodes a byte.

val binary_of_octets : string -> t

binary_of_octets s is the OCaml string of bytes s as a JavaScript binary string in which each 16-bit character encodes a byte.
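As a sketch of the binary string conversions, arbitrary bytes, including ones that are not valid UTF-8, should survive a round trip. It assumes brr and js_of_ocaml; bytes and js are illustrative names.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)

(* Three arbitrary bytes that do not form valid UTF-8. *)
let bytes = "\x00\xFF\x80"

(* Each byte becomes one 16-bit JavaScript character. *)
let js = Jstr.binary_of_octets bytes

let () =
  assert (Jstr.length js = String.length bytes);
  (* Converting back yields the original bytes. *)
  assert (Jstr.binary_to_octets js = bytes)
```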

val to_int : ?base:int -> t -> int option

to_int ~base s is the integer resulting from parsing s as a number in base base (guessed by default). The function uses Number.parseInt and maps Float.nan results to None.

val of_int : ?base:int -> int -> t

of_int ~base i formats i as a number in base base (defaults to 10). Conversion is performed via Number.toString.
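A sketch of the integer conversions with and without an explicit base, assuming brr and js_of_ocaml. The binding names are illustrative.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)

(* Without ~base the base is guessed from the input. *)
let decimal = Jstr.to_int (Jstr.v "42")

(* With an explicit base of 16. *)
let hex = Jstr.to_int ~base:16 (Jstr.v "ff")

(* Input that does not parse as a number yields None. *)
let bad = Jstr.to_int (Jstr.v "not a number")

(* Formatting 5 in base 2 and 255 in the default base 10. *)
let binary = Jstr.of_int ~base:2 5
let plain = Jstr.of_int 255

let () =
  assert (decimal = Some 42);
  assert (hex = Some 255);
  assert (bad = None);
  assert (Jstr.to_string binary = "101");
  assert (Jstr.to_string plain = "255")
```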

val to_float : t -> float

to_float s is the floating point number resulting from parsing s. This always succeeds and returns Float.nan on unparseable inputs. The function uses Number.parseFloat.

val of_float : ?frac:int -> float -> t

of_float ~frac n formats n with frac fixed fractional digits (or as needed if unspecified). This function uses Number.toFixed if frac is specified and Number.toString otherwise.
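The float conversions might be used as in the sketch below, which assumes brr and js_of_ocaml. Since to_float never fails, the result has to be checked for nan explicitly.

```ocaml
(* Assumption: built with the brr library and run as JavaScript
   via js_of_ocaml. *)

(* Two fixed fractional digits. *)
let fixed = Jstr.of_float ~frac:2 3.14159

(* As many digits as needed. *)
let as_needed = Jstr.of_float 0.5

let parsed = Jstr.to_float (Jstr.v "1.5")

(* Unparseable input yields nan rather than an error. *)
let unparsed = Jstr.to_float (Jstr.v "abc")

let () =
  assert (Jstr.to_string fixed = "3.14");
  assert (Jstr.to_string as_needed = "0.5");
  assert (parsed = 1.5);
  assert (Float.is_nan unparsed)
```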