Module String.Ascii

US-ASCII string support.

References.

Predicates

val is_valid : string -> bool

is_valid s is true iff, for every index i of s, s.[i] is a US-ASCII character, i.e. a byte in the range [0x00;0x7F].
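
The predicate can be sketched with the standard library alone. This is a hypothetical re-implementation for illustration, not the module's actual code; the local name is_valid only mirrors the signature above.

```ocaml
(* Hypothetical stdlib-only sketch of the documented predicate:
   true iff every byte of [s] is in the range [0x00;0x7F]. *)
let is_valid (s : string) : bool =
  let rec loop i =
    if i >= String.length s then true
    else if Char.code s.[i] > 0x7F then false
    else loop (i + 1)
  in
  loop 0

let () =
  assert (is_valid "hello\x7F");
  (* The UTF-8 encoding of U+00E9 is two bytes above 0x7F. *)
  assert (not (is_valid "h\xC3\xA9llo"))
```

In this sketch the empty string is valid, since the condition holds vacuously.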

Casing transforms

The following functions act only on US-ASCII code points, that is, on bytes in the range [0x00;0x7F], and leave every other byte intact. They can therefore be used safely on UTF-8 encoded strings, but they only handle US-ASCII casings.

val uppercase : string -> string

uppercase s is s with US-ASCII characters 'a' to 'z' mapped to 'A' to 'Z'.

val lowercase : string -> string

lowercase s is s with US-ASCII characters 'A' to 'Z' mapped to 'a' to 'z'.
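
A minimal sketch of the two maps, assuming a byte-wise implementation on top of the standard String.map. The helpers uppercase_char and lowercase_char are hypothetical names introduced for the example; this is not the module's actual code.

```ocaml
(* Hypothetical byte-wise sketch: only 'a'..'z' and 'A'..'Z' are mapped,
   every other byte (including UTF-8 multi-byte sequences) is kept. *)
let uppercase_char c =
  if 'a' <= c && c <= 'z' then Char.chr (Char.code c - 32) else c

let lowercase_char c =
  if 'A' <= c && c <= 'Z' then Char.chr (Char.code c + 32) else c

let uppercase s = String.map uppercase_char s
let lowercase s = String.map lowercase_char s

let () =
  (* "\xC3\xA9" and "\xC3\x89" are the UTF-8 encodings of U+00E9 and
     U+00C9; both are left untouched. *)
  assert (uppercase "caf\xC3\xA9" = "CAF\xC3\xA9");
  assert (lowercase "\xC3\x89COLE" = "\xC3\x89cole")
```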

val capitalize : string -> string

capitalize s is like uppercase but performs the map only on s.[0].

val uncapitalize : string -> string

uncapitalize s is like lowercase but performs the map only on s.[0].
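
The first-byte variants can be sketched with the standard String.mapi and Char.uppercase_ascii / Char.lowercase_ascii. This is an illustration only; returning the empty string unchanged is an assumption of the sketch, not something stated above.

```ocaml
(* Hypothetical sketch: apply the US-ASCII case map to index 0 only.
   Assumption: the empty string is returned unchanged. *)
let capitalize s =
  String.mapi (fun i c -> if i = 0 then Char.uppercase_ascii c else c) s

let uncapitalize s =
  String.mapi (fun i c -> if i = 0 then Char.lowercase_ascii c else c) s

let () =
  assert (capitalize "hello world" = "Hello world");
  assert (uncapitalize "Hello World" = "hello World");
  (* A leading non US-ASCII byte is left intact. *)
  assert (capitalize "\xC3\xA9t\xC3\xA9" = "\xC3\xA9t\xC3\xA9")
```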

Escaping to printable US-ASCII

val escape : string -> string

escape s is s with:

  • Any '\\' (0x5C) escaped to the sequence "\\\\" (0x5C,0x5C).
  • Any byte in the ranges [0x00;0x1F] and [0x7F;0xFF] escaped by a hexadecimal "\xHH" escape, with H an uppercase hexadecimal digit. These bytes are the US-ASCII control characters and the non US-ASCII bytes.
  • Any other byte is left unchanged.
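
The three rules above can be sketched as follows, using only Buffer and Printf from the standard library. This is a hypothetical re-implementation for illustration, not the module's actual code.

```ocaml
(* Hypothetical sketch of the escaping rules listed above. *)
let escape (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      let code = Char.code c in
      if c = '\\' then Buffer.add_string b "\\\\"
      else if code <= 0x1F || code >= 0x7F then
        (* Control characters and non US-ASCII bytes: "\xHH", uppercase hex. *)
        Buffer.add_string b (Printf.sprintf "\\x%02X" code)
      else Buffer.add_char b c)
    s;
  Buffer.contents b

let () =
  assert (escape "a\\b" = "a\\\\b");
  assert (escape "\x00\n\xFF" = "\\x00\\x0A\\xFF");
  (* Printable bytes, including '"', are left unchanged. *)
  assert (escape "plain \"text\"" = "plain \"text\"")
```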

val unescape : string -> string option

unescape s reverses escape. The hexadecimal digits of an escape may be uppercase, lowercase or mixed case, and any two-digit hex escape is decoded to its corresponding byte. Any escape sequence not defined by escape, or any truncated escape, makes the function return None.

The invariant unescape (escape s) = Some s holds.
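
A minimal decoding sketch, assuming that bytes other than '\\' are copied through unchanged (the text above does not say how unescaped control bytes are treated). The helper hex_value is a hypothetical name; this is not the module's actual code.

```ocaml
(* Hypothetical helper: value of one hexadecimal digit, any case. *)
let hex_value c =
  match c with
  | '0' .. '9' -> Some (Char.code c - Char.code '0')
  | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
  | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
  | _ -> None

(* Hypothetical sketch: decodes "\\\\" and "\\xHH", returns None on any
   other or truncated escape. Assumption: other bytes are copied as is. *)
let unescape (s : string) : string option =
  let len = String.length s in
  let b = Buffer.create len in
  let rec loop i =
    if i >= len then Some (Buffer.contents b)
    else if s.[i] <> '\\' then (Buffer.add_char b s.[i]; loop (i + 1))
    else if i + 1 >= len then None (* lone trailing backslash *)
    else
      match s.[i + 1] with
      | '\\' -> Buffer.add_char b '\\'; loop (i + 2)
      | 'x' when i + 3 < len ->
          (match hex_value s.[i + 2], hex_value s.[i + 3] with
           | Some h, Some l ->
               Buffer.add_char b (Char.chr ((h * 16) + l));
               loop (i + 4)
           | _ -> None)
      | _ -> None (* unknown escape, or "\\x" with fewer than two digits *)
  in
  loop 0

let () =
  assert (unescape "a\\\\b" = Some "a\\b");
  assert (unescape "\\x0a\\x0A\\xfF" = Some "\n\n\xFF");
  assert (unescape "\\n" = None);
  assert (unescape "\\x0" = None)
```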

val escape_string : string -> string

escape_string s is like escape, except that it escapes s according to OCaml's lexical conventions for strings:

  • Any '\b' (0x08) escaped to the sequence "\\b" (0x5C,0x62).
  • Any '\t' (0x09) escaped to the sequence "\\t" (0x5C,0x74).
  • Any '\n' (0x0A) escaped to the sequence "\\n" (0x5C,0x6E).
  • Any '\r' (0x0D) escaped to the sequence "\\r" (0x5C,0x72).
  • Any '\"' (0x22) escaped to the sequence "\\\"" (0x5C,0x22).
  • Any other byte follows the rules of escape.
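
The rules above can be sketched as a single pattern match over the bytes of s. This is a hypothetical re-implementation using only the standard library, not the module's actual code; the last three cases restate the rules of escape.

```ocaml
(* Hypothetical sketch of the OCaml-style escaping rules listed above. *)
let escape_string (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | '\b' -> Buffer.add_string b "\\b"
      | '\t' -> Buffer.add_string b "\\t"
      | '\n' -> Buffer.add_string b "\\n"
      | '\r' -> Buffer.add_string b "\\r"
      | '"' -> Buffer.add_string b "\\\""
      (* Remaining cases follow the rules of escape. *)
      | '\\' -> Buffer.add_string b "\\\\"
      | c when Char.code c <= 0x1F || Char.code c >= 0x7F ->
          Buffer.add_string b (Printf.sprintf "\\x%02X" (Char.code c))
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

let () =
  assert (escape_string "say \"hi\"\n" = "say \\\"hi\\\"\\n");
  assert (escape_string "\x01\t" = "\\x01\\t")
```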

val unescape_string : string -> string option

unescape_string is to escape_string what unescape is to escape; it additionally unescapes the sequence "\\'" (0x5C,0x27) to "'" (0x27).
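
A decoding sketch under the same assumptions as for unescape: bytes other than '\\' are copied through unchanged, and any unknown or truncated escape yields None. The helpers hex_value and simple_escape are hypothetical names; this is not the module's actual code.

```ocaml
(* Hypothetical helper: value of one hexadecimal digit, any case. *)
let hex_value c =
  match c with
  | '0' .. '9' -> Some (Char.code c - Char.code '0')
  | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
  | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
  | _ -> None

(* Hypothetical helper: byte denoted by a one-letter escape, if any. *)
let simple_escape = function
  | '\\' -> Some '\\'
  | 'b' -> Some '\b'
  | 't' -> Some '\t'
  | 'n' -> Some '\n'
  | 'r' -> Some '\r'
  | '"' -> Some '"'
  | '\'' -> Some '\''
  | _ -> None

let unescape_string (s : string) : string option =
  let len = String.length s in
  let b = Buffer.create len in
  let rec loop i =
    if i >= len then Some (Buffer.contents b)
    else if s.[i] <> '\\' then (Buffer.add_char b s.[i]; loop (i + 1))
    else if i + 1 >= len then None (* lone trailing backslash *)
    else
      match simple_escape s.[i + 1] with
      | Some c -> Buffer.add_char b c; loop (i + 2)
      | None ->
          if s.[i + 1] = 'x' && i + 3 < len then
            (match hex_value s.[i + 2], hex_value s.[i + 3] with
             | Some h, Some l ->
                 Buffer.add_char b (Char.chr ((h * 16) + l));
                 loop (i + 4)
             | _ -> None)
          else None
  in
  loop 0

let () =
  assert (unescape_string "it\\'s" = Some "it's");
  assert (unescape_string "a\\tb\\n" = Some "a\tb\n");
  assert (unescape_string "\\q" = None)
```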