Module Astring.String.Sub

module Sub: sig .. end
Substrings.

A substring defines a possibly empty subsequence of bytes in a base string.

The positions of a string s of length l are the slits found before each byte and after the last byte of the string. They are labelled from left to right by increasing number in the range [0;l].

positions  0   1   2   3   4    l-1    l
           +---+---+---+---+     +-----+
  indices  | 0 | 1 | 2 | 3 | ... | l-1 |
           +---+---+---+---+     +-----+

The ith byte index is between positions i and i+1.

Formally we define a substring of s as being a subsequence of bytes defined by a start and a stop position. The former is always less than or equal to the latter. When both positions are equal the substring is empty. Note that for a given base string there are as many empty substrings as there are positions in the string.

Like in strings, we index the bytes of a substring using zero-based indices.
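
The distinction between positions and indices can be sketched as follows. This is a minimal illustration that assumes the astring library is available; the names s, whole, middle and empties are made up for the example.

```ocaml
module Sub = Astring.String.Sub

(* "abc" has length 3, hence positions 0, 1, 2 and 3. *)
let s = "abc"

(* Positions 0 to 3: the whole string. *)
let whole = Sub.v s

(* Positions 1 to 2 enclose the byte at index 1, that is "b". *)
let middle = Sub.v ~start:1 ~stop:2 s

(* One empty substring per position: four of them for a 3-byte string. *)
let empties = List.map (fun p -> Sub.v ~start:p ~stop:p s) [0; 1; 2; 3]
```

Each empty substring is distinct because it remembers where it sits in the base string.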

See how to use substrings to parse data.



Substrings


type t = Astring.String.sub 
The type for substrings.
val empty : Astring.String.sub
empty is the empty substring of the empty string Astring.String.empty.
val v : ?start:int -> ?stop:int -> string -> Astring.String.sub
v ~start ~stop s is the substring of s that starts at position start (defaults to 0) and stops at position stop (defaults to String.length s).
Raises Invalid_argument if start or stop are not positions of s or if stop < start.
val start_pos : Astring.String.sub -> int
start_pos s is s's start position in the base string.
val stop_pos : Astring.String.sub -> int
stop_pos s is s's stop position in the base string.
val base_string : Astring.String.sub -> string
base_string s is s's base string.
val length : Astring.String.sub -> int
length s is the number of bytes in s.
val get : Astring.String.sub -> int -> char
get s i is the byte of s at its zero-based index i.
Raises Invalid_argument if i is not an index of s.
val get_byte : Astring.String.sub -> int -> int
get_byte s i is Char.to_int (get s i).
val head : ?rev:bool -> Astring.String.sub -> char option
head s is Some (get s h) with h = 0 if rev = false (default) or h = length s - 1 if rev = true. None is returned if s is empty.
val get_head : ?rev:bool -> Astring.String.sub -> char
get_head s is like Astring.String.Sub.head but raises Invalid_argument if s is empty.
val of_string : string -> Astring.String.sub
of_string s is v s.
val to_string : Astring.String.sub -> string
to_string s is the bytes of s as a string.
val rebase : Astring.String.sub -> Astring.String.sub
rebase s is v (to_string s). This puts s on a base string made solely of its bytes.
val hash : Astring.String.sub -> int
hash s is Hashtbl.hash s.
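
As a rough sketch of how these accessors fit together (assuming the astring library is available; the names base, a, first, last and rebased are illustrative only):

```ocaml
module Sub = Astring.String.Sub

let base = "Revolt now!"

(* Bytes between positions 2 and 6, that is "volt". *)
let a = Sub.v ~start:2 ~stop:6 base

(* Indices are relative to the substring, not to the base string. *)
let first = Sub.head a             (* Some 'v' *)
let last = Sub.head ~rev:true a    (* Some 't' *)

(* rebase copies the bytes of a onto a base string of their own. *)
let rebased = Sub.rebase a
```

Here a keeps pointing into base, while rebased starts at position 0 of a base string that holds only "volt".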

Stretching substrings

See the graphical guide.

val start : Astring.String.sub -> Astring.String.sub
start s is the empty substring at the start position of s.
val stop : Astring.String.sub -> Astring.String.sub
stop s is the empty substring at the stop position of s.
val base : Astring.String.sub -> Astring.String.sub
base s is a substring that spans the whole base string of s.
val tail : ?rev:bool -> Astring.String.sub -> Astring.String.sub
tail s is s without its first (rev is false, default) or last (rev is true) byte or s if it is empty.
val extend : ?rev:bool ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub
extend ~rev ~max ~sat s extends s by at most max consecutive sat satisfying bytes of the base string located after stop s (rev is false, default) or before start s (rev is true). If max is unspecified the extension is limited by the extents of the base string of s. sat defaults to fun _ -> true.
Raises Invalid_argument if max is negative.
val reduce : ?rev:bool ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub
reduce ~rev ~max ~sat s reduces s by at most max consecutive sat satisfying bytes of s located before stop s (rev is false, default) or after start s (rev is true). If max is unspecified the reduction is limited by the extents of the substring s. sat defaults to fun _ -> true.
Raises Invalid_argument if max is negative.
val extent : Astring.String.sub -> Astring.String.sub -> Astring.String.sub
extent s s' is the smallest substring that includes all the positions of s and s'.
Raises Invalid_argument if s and s' are not on the same base string according to physical equality.
val overlap : Astring.String.sub -> Astring.String.sub -> Astring.String.sub option
overlap s s' is the smallest substring that includes all the positions common to s and s' or None if there are no such positions. Note that the overlap substring may be empty.
Raises Invalid_argument if s and s' are not on the same base string according to physical equality.
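
The effect of sat and max on extend and reduce can be sketched like this. It is only an illustration under the assumption that the astring library is available; is_digit, base, cursor and the other names are hypothetical helpers introduced for the example.

```ocaml
module Sub = Astring.String.Sub

(* Hypothetical helper, not part of the documented API. *)
let is_digit c = '0' <= c && c <= '9'

let base = "id=12345;"

(* Empty substring sitting right after "id=". *)
let cursor = Sub.v ~start:3 ~stop:3 base

(* Grow to the right over every digit: "12345". *)
let digits = Sub.extend ~sat:is_digit cursor

(* Same, but take at most two bytes: "12". *)
let two = Sub.extend ~max:2 ~sat:is_digit cursor

(* Drop at most two bytes from the right end of digits: "123". *)
let shortened = Sub.reduce ~max:2 digits
```

All four substrings stay on the same base string, so they can later be combined with extent or overlap.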

Appending substrings


val append : Astring.String.sub -> Astring.String.sub -> Astring.String.sub
append s s' is like Astring.String.append. The substrings can be on different bases and the result is on a base string that holds exactly the appended bytes.
val concat : ?sep:Astring.String.sub -> Astring.String.sub list -> Astring.String.sub
concat ~sep ss is like Astring.String.concat. The substrings can all be on different bases and the result is on a base string that holds exactly the concatenated bytes.
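
A small sketch of appending substrings taken from different base strings, assuming the astring library is available (the names a, b, ab and joined are illustrative):

```ocaml
module Sub = Astring.String.Sub

let a = Sub.v ~start:0 ~stop:3 "foobar"   (* "foo" *)
let b = Sub.v ~start:3 "bazqux"           (* "qux" *)

(* The result lives on a new base string holding exactly "fooqux". *)
let ab = Sub.append a b

(* The separator is itself a substring. *)
let joined = Sub.concat ~sep:(Sub.v ", ") [a; b; ab]
```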

Predicates


val is_empty : Astring.String.sub -> bool
is_empty s is length s = 0.
val is_prefix : affix:Astring.String.sub -> Astring.String.sub -> bool
is_prefix is like Astring.String.is_prefix. Only bytes are compared, affix can be on a different base string.
val is_infix : affix:Astring.String.sub -> Astring.String.sub -> bool
is_infix is like Astring.String.is_infix. Only bytes are compared, affix can be on a different base string.
val is_suffix : affix:Astring.String.sub -> Astring.String.sub -> bool
is_suffix is like Astring.String.is_suffix. Only bytes are compared, affix can be on a different base string.
val for_all : (char -> bool) -> Astring.String.sub -> bool
for_all is like Astring.String.for_all on the substring.
val exists : (char -> bool) -> Astring.String.sub -> bool
exists is like Astring.String.exists on the substring.
val same_base : Astring.String.sub -> Astring.String.sub -> bool
same_base s s' is true iff the substrings s and s' have the same base string according to physical equality.
val equal_bytes : Astring.String.sub -> Astring.String.sub -> bool
equal_bytes s s' is true iff the substrings s and s' have exactly the same bytes. The substrings can be on a different base string.
val compare_bytes : Astring.String.sub -> Astring.String.sub -> int
compare_bytes s s' compares the bytes of s and s' in lexicographical order. The substrings can be on a different base string.
val equal : Astring.String.sub -> Astring.String.sub -> bool
equal s s' is true iff s and s' have the same positions.
Raises Invalid_argument if s and s' are not on the same base string according to physical equality.
val compare : Astring.String.sub -> Astring.String.sub -> int
compare s s' compares the positions of s and s' in lexicographical order.
Raises Invalid_argument if s and s' are not on the same base string according to physical equality.
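
The difference between comparing bytes and comparing positions can be sketched as follows, assuming the astring library is available (x, y and z are illustrative names):

```ocaml
module Sub = Astring.String.Sub

let base = "abcabc"

(* Two substrings with the same bytes at different positions of base. *)
let x = Sub.v ~start:0 ~stop:3 base
let y = Sub.v ~start:3 ~stop:6 base

(* The same bytes again, on another base string. *)
let z = Sub.v "abc"

let same_bytes = Sub.equal_bytes x y       (* true: both are "abc" *)
let same_positions = Sub.equal x y         (* false: 0..3 versus 3..6 *)
let cross_base_bytes = Sub.equal_bytes x z (* true: bases are ignored *)
```

Since x and z are on different base strings, equal x z and compare x z are the calls documented to raise Invalid_argument; equal_bytes and compare_bytes are the ones to use across bases.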

Extracting substrings

Extracted substrings are always on the same base string as the substring s acted upon.

val with_range : ?first:int -> ?len:int -> Astring.String.sub -> Astring.String.sub
with_range is like Astring.String.sub_with_range. The indices are the substring's zero-based ones, not those in the base string.
val with_index_range : ?first:int -> ?last:int -> Astring.String.sub -> Astring.String.sub
with_index_range is like Astring.String.sub_with_index_range. The indices are the substring's zero-based ones, not those in the base string.
val trim : ?drop:(char -> bool) -> Astring.String.sub -> Astring.String.sub
trim is like Astring.String.trim. If all bytes are dropped returns an empty substring located in the middle of the argument.
val span : ?rev:bool ->
?min:int ->
?max:int ->
?sat:(char -> bool) ->
Astring.String.sub -> Astring.String.sub * Astring.String.sub
span is like Astring.String.span. For a substring s a left empty span is start s and a right empty span is stop s.
val take : ?rev:bool ->
?min:int ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub
take is like Astring.String.take.
val drop : ?rev:bool ->
?min:int ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub
drop is like Astring.String.drop.
val cut : ?rev:bool ->
sep:Astring.String.sub ->
Astring.String.sub -> (Astring.String.sub * Astring.String.sub) option
cut is like Astring.String.cut. sep can be on a different base string.
val cuts : ?rev:bool ->
?empty:bool ->
sep:Astring.String.sub -> Astring.String.sub -> Astring.String.sub list
cuts is like Astring.String.cuts. sep can be on a different base string.
val fields : ?empty:bool ->
?is_sep:(char -> bool) -> Astring.String.sub -> Astring.String.sub list
fields is like Astring.String.fields.
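
The extraction functions can be sketched as a small parsing step. This is an illustration that assumes the astring library is available; is_digit, input and the other bindings are hypothetical names.

```ocaml
module Sub = Astring.String.Sub

(* Hypothetical helper, not part of the documented API. *)
let is_digit c = '0' <= c && c <= '9'

let input = Sub.v "name=astring"

(* Split on the first "="; both halves stay on the base string of input. *)
let key, value =
  match Sub.cut ~sep:(Sub.v "=") input with
  | Some (k, v) -> (k, v)
  | None -> (input, Sub.stop input)  (* no separator: everything is the key *)

(* Empty fields are kept by default. *)
let items = Sub.cuts ~sep:(Sub.v ",") (Sub.v "a,b,,c")

(* Leading digits and the remainder. *)
let number, unit_name = Sub.span ~sat:is_digit (Sub.v "42px")

(* Default trim drops surrounding white space. *)
let trimmed = Sub.trim (Sub.v "  hi  ")
```

No bytes are copied by these calls: each result is a pair of positions on the base string of the argument.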

Traversing substrings


val find : ?rev:bool ->
(char -> bool) -> Astring.String.sub -> Astring.String.sub option
find ~rev sat s is the substring of s (if any) that spans the first byte that satisfies sat in s after position start s (rev is false, default) or before stop s (rev is true). None is returned if there is no matching byte in s.
val find_sub : ?rev:bool ->
sub:Astring.String.sub -> Astring.String.sub -> Astring.String.sub option
find_sub ~rev ~sub s is the substring of s (if any) that spans the first match of sub in s after position start s (rev is false, default) or before stop s (rev is true). Only bytes are compared and sub can be on a different base string. None is returned if there is no match of sub in s.
val filter : (char -> bool) -> Astring.String.sub -> Astring.String.sub
filter sat s is like Astring.String.filter. The result is on a base string that holds only the filtered bytes.
val filter_map : (char -> char option) -> Astring.String.sub -> Astring.String.sub
filter_map f s is like Astring.String.filter_map. The result is on a base string that holds only the filtered bytes.
val map : (char -> char) -> Astring.String.sub -> Astring.String.sub
map is like Astring.String.map. The result is on a base string that holds only the mapped bytes.
val mapi : (int -> char -> char) -> Astring.String.sub -> Astring.String.sub
mapi is like Astring.String.mapi. The result is on a base string that holds only the mapped bytes. The indices are the substring's zero-based ones, not those in the base string.
val fold_left : ('a -> char -> 'a) -> 'a -> Astring.String.sub -> 'a
fold_left is like Astring.String.fold_left.
val fold_right : (char -> 'a -> 'a) -> Astring.String.sub -> 'a -> 'a
fold_right is like Astring.String.fold_right.
val iter : (char -> unit) -> Astring.String.sub -> unit
iter is like Astring.String.iter.
val iteri : (int -> char -> unit) -> Astring.String.sub -> unit
iteri is like Astring.String.iteri. The indices are the substring's zero-based ones, not those in the base string.
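
A sketch of searching and traversing, assuming the astring library is available (base, whole, now and the other bindings are illustrative names):

```ocaml
module Sub = Astring.String.Sub

let base = "Revolt now!"
let whole = Sub.v base
let now = Sub.v ~start:7 base   (* "now!" *)

let is_o c = c = 'o'

(* Forward search finds the 'o' of "Revolt", backward search the 'o' of "now". *)
let first_o = Sub.find is_o whole
let last_o = Sub.find ~rev:true is_o whole

(* Match a substring that lives on another base string. *)
let ow = Sub.find_sub ~sub:(Sub.v "ow") whole

(* filter and mapi build a new base string; mapi indices start at 0 in now. *)
let no_o = Sub.filter (fun c -> not (is_o c)) now
let capitalized = Sub.mapi (fun i c -> if i = 0 then 'N' else c) now

(* Count bytes with a fold. *)
let count = Sub.fold_left (fun n _ -> n + 1) 0 now
```

The substrings returned by find and find_sub remain on base, so their positions can be read back with start_pos and stop_pos.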

Pretty printing


val pp : Format.formatter -> Astring.String.sub -> unit
pp ppf s prints s's bytes on ppf.
val dump : Format.formatter -> Astring.String.sub -> unit
dump ppf s prints s as a syntactically valid OCaml string on ppf using Astring.String.Ascii.escape_string.
val dump_raw : Format.formatter -> Astring.String.sub -> unit
dump_raw ppf s prints an unspecified raw internal representation of s on ppf.
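
The printers are meant to be used with the %a directive of Format. A minimal sketch, assuming the astring library is available (a, printed and dumped are illustrative names):

```ocaml
module Sub = Astring.String.Sub

let a = Sub.v ~start:2 ~stop:6 "Revolt now!"   (* "volt" *)

(* pp prints only the bytes of the substring. *)
let printed = Format.asprintf "%a" Sub.pp a

(* dump prints the same bytes as an escaped OCaml string literal. *)
let dumped = Format.asprintf "%a" Sub.dump a
```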

OCaml base type conversions


val of_char : char -> Astring.String.sub
of_char c is a string that contains the byte c.
val to_char : Astring.String.sub -> char option
to_char s is the single byte in s or None if there is no byte or more than one in s.
val of_bool : bool -> Astring.String.sub
of_bool b is a string representation for b. Relies on Pervasives.string_of_bool.
val to_bool : Astring.String.sub -> bool option
to_bool s is a bool from s, if any. Relies on Pervasives.bool_of_string.
val of_int : int -> Astring.String.sub
of_int i is a string representation for i. Relies on Pervasives.string_of_int.
val to_int : Astring.String.sub -> int option
to_int s is an int from s, if any. Relies on Pervasives.int_of_string.
val of_nativeint : nativeint -> Astring.String.sub
of_nativeint i is a string representation for i. Relies on Nativeint.to_string.
val to_nativeint : Astring.String.sub -> nativeint option
to_nativeint s is a nativeint from s, if any. Relies on Nativeint.of_string.
val of_int32 : int32 -> Astring.String.sub
of_int32 i is a string representation for i. Relies on Int32.to_string.
val to_int32 : Astring.String.sub -> int32 option
to_int32 s is an int32 from s, if any. Relies on Int32.of_string.
val of_int64 : int64 -> Astring.String.sub
of_int64 i is a string representation for i. Relies on Int64.to_string.
val to_int64 : Astring.String.sub -> int64 option
to_int64 s is an int64 from s, if any. Relies on Int64.of_string.
val of_float : float -> Astring.String.sub
of_float f is a string representation for f. Relies on Pervasives.string_of_float.
val to_float : Astring.String.sub -> float option
to_float s is a float from s, if any. Relies on Pervasives.float_of_string.
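
The conversions return options instead of raising, which a short sketch can show. It assumes the astring library is available; the bindings are illustrative names.

```ocaml
module Sub = Astring.String.Sub

(* Parse the two bytes "42" found inside a larger base string. *)
let port = Sub.to_int (Sub.v ~start:4 ~stop:6 "port42x")   (* Some 42 *)

(* Bytes that do not form an int yield None. *)
let bad = Sub.to_int (Sub.v "4x2")                         (* None *)

let flag = Sub.to_bool (Sub.v "true")                      (* Some true *)

(* to_char needs exactly one byte. *)
let not_a_char = Sub.to_char (Sub.v "ab")                  (* None *)

(* The of_* functions go the other way. *)
let seven = Sub.to_string (Sub.of_int 7)                   (* "7" *)
```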

Substring stretching graphical guide

+---+---+---+---+---+---+---+---+---+---+---+
| R | e | v | o | l | t |   | n | o | w | ! |
+---+---+---+---+---+---+---+---+---+---+---+
        |---------------|                      a
        |                                      start a
                        |                      stop a
            |-----------|                      tail a
        |-----------|                          tail ~rev:true a
        |-----------------------------------|  extend a
|-----------------------|                      extend ~rev:true a
|-------------------------------------------|  base a
|-----------|                                  b
|                                              start b
            |                                  stop b
    |-------|                                  tail b
|-------|                                      tail ~rev:true b
|-------------------------------------------|  extend b
|-----------|                                  extend ~rev:true b
|-------------------------------------------|  base b
|-----------------------|                      extent a b
        |---|                                  overlap a b
                            |                  c
                            |                  start c
                            |                  stop c
                            |                  tail c
                            |                  tail ~rev:true c
                            |---------------|  extend c
|---------------------------|                  extend ~rev:true c
|-------------------------------------------|  base c
        |-------------------|                  extent a c
                                         None  overlap a c
                            |---------------|  d
                            |                  start d
                                            |  stop d
                                |-----------|  tail d
                            |-----------|      tail ~rev:true d
                            |---------------|  extend d
|-------------------------------------------|  extend ~rev:true d
|-------------------------------------------|  base d
                            |---------------|  extent d c
                            |                  overlap d c
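
Part of the guide can be reproduced in code. The sketch below assumes the astring library is available; bounds is a hypothetical helper that returns the start and stop positions of a substring so that results can be compared with the diagram.

```ocaml
module Sub = Astring.String.Sub

let base = "Revolt now!"

let a = Sub.v ~start:2 ~stop:6 base    (* "volt" *)
let b = Sub.v ~start:0 ~stop:3 base    (* "Rev" *)
let c = Sub.v ~start:7 ~stop:7 base    (* empty, at position 7 *)
let d = Sub.v ~start:7 ~stop:11 base   (* "now!" *)

(* Hypothetical helper: (start position, stop position). *)
let bounds s = (Sub.start_pos s, Sub.stop_pos s)

let tail_a = bounds (Sub.tail a)                   (* (3, 6) *)
let extend_a = bounds (Sub.extend a)               (* (2, 11) *)
let extend_rev_a = bounds (Sub.extend ~rev:true a) (* (0, 6) *)
let extent_a_b = bounds (Sub.extent a b)           (* (0, 6) *)
let overlap_a_b = Sub.overlap a b                  (* positions 2 to 3 *)
let overlap_a_c = Sub.overlap a c                  (* None *)
let overlap_d_c = Sub.overlap d c                  (* empty, at position 7 *)
```

All four substrings are built from the same base value, which is what extent and overlap require.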