Txt (down.Down

val find_next : sat:(char -> bool) -> string -> start:int -> int

find_next ~sat s ~start is either String.length s or the index of the first byte at or after start that satisfies sat.

val find_prev : sat:(char -> bool) -> string -> start:int -> int

find_prev ~sat s ~start is either 0 or the index of the first byte at or before start that satisfies sat.
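The two scans can be sketched with the standard library alone. This is an illustrative re-implementation, not the library's source. It assumes that find_next returns String.length s when nothing matches and that an out-of-range start is clamped into the string.

```ocaml
(* Forward scan: first index >= start whose byte satisfies [sat],
   or [String.length s] if there is none. *)
let find_next ~sat s ~start =
  let len = String.length s in
  let rec loop i =
    if i >= len then len else
    if sat s.[i] then i else loop (i + 1)
  in
  loop (max 0 start)

(* Backward scan: first index <= start whose byte satisfies [sat],
   or 0 if there is none. *)
let find_prev ~sat s ~start =
  let rec loop i =
    if i <= 0 then 0 else
    if sat s.[i] then i else loop (i - 1)
  in
  loop (min start (String.length s - 1))
```

Note that a result of 0 from find_prev is ambiguous: it means either that byte 0 satisfies sat or that no byte was found.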

val keep_next_len : sat:(char -> bool) -> string -> start:int -> int

keep_next_len ~sat s ~start is the number of consecutive bytes satisfying sat, counting forward from start (included).

val keep_prev_len : sat:(char -> bool) -> string -> start:int -> int

keep_prev_len ~sat s ~start is the number of consecutive bytes satisfying sat, counting backward from start (included).
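A minimal sketch of the two counting functions follows. It is not the library's implementation, and it assumes that a start outside the string is clamped into it (to 0 going forward, to the last byte going backward), which yields a count of 0 on the empty string.

```ocaml
(* Number of consecutive [sat] bytes at start, start + 1, ... *)
let keep_next_len ~sat s ~start =
  let len = String.length s in
  let rec loop i = if i < len && sat s.[i] then loop (i + 1) else i in
  let start = max 0 start in
  if start >= len then 0 else loop start - start

(* Number of consecutive [sat] bytes at start, start - 1, ... *)
let keep_prev_len ~sat s ~start =
  let rec loop i = if i >= 0 && sat s.[i] then loop (i - 1) else i in
  let start = min start (String.length s - 1) in
  if start < 0 then 0 else start - loop start
```

In both directions the count is 0 when the byte at start does not satisfy sat.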

Lines

val lines : string -> string list

lines s splits s into lines separated by CR, CRLF or LF. This is [""] on the empty string.
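The splitting rule can be illustrated with a standard-library-only sketch. It is not the library's code. It treats CRLF as a single separator, and it assumes that a trailing separator produces a final empty line, which the library may or may not do.

```ocaml
(* Split on CR, CRLF and LF. [first] is the start of the current line. *)
let lines s =
  let len = String.length s in
  let rec loop acc first i =
    if i >= len then List.rev (String.sub s first (len - first) :: acc) else
    match s.[i] with
    | '\n' -> loop (String.sub s first (i - first) :: acc) (i + 1) (i + 1)
    | '\r' ->
        (* Consume a following LF so that CRLF counts once. *)
        let next = if i + 1 < len && s.[i + 1] = '\n' then i + 2 else i + 1 in
        loop (String.sub s first (i - first) :: acc) next next
    | _ -> loop acc first (i + 1)
  in
  loop [] 0 0
```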

val is_eol : char -> bool

is_eol c is true iff c is '\r' or '\n'.

val find_next_eol : string -> start:int -> int

find_next_eol s ~start is either String.length s or the index of the first byte at or after start that satisfies is_eol.

val find_prev_eol : string -> start:int -> int

find_prev_eol s ~start is either 0 or the index of the byte at or before start that satisfies is_eol.

val find_prev_sol : string -> start:int -> int

find_prev_sol s ~start is either 0 or the position after the first byte at or before start that satisfies is_eol. This can be String.length s.
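As an illustration, here is a sketch of the forward end-of-line scan and the backward start-of-line scan. It is written against the standard library only and is not the library's source. The clamping of start is an assumption.

```ocaml
let is_eol c = c = '\r' || c = '\n'

(* Index of the next end-of-line byte, or [String.length s]. *)
let find_next_eol s ~start =
  let len = String.length s in
  let rec loop i =
    if i >= len then len else
    if is_eol s.[i] then i else loop (i + 1)
  in
  loop (max 0 start)

(* Position just after the previous end-of-line byte, or 0. *)
let find_prev_sol s ~start =
  let rec loop i =
    if i < 0 then 0 else
    if is_eol s.[i] then i + 1 else loop (i - 1)
  in
  loop (min start (String.length s - 1))
```

With "one\ntwo\nthree" and position 5 (inside "two"), find_prev_sol yields 4 and find_next_eol yields 7, so String.sub s 4 (7 - 4) is the current line. If the last byte of s is an end of line and start points at it, find_prev_sol yields String.length s.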

UTF-8 encoded Unicode characters

val utf_8_decode_len : char -> int

utf_8_decode_len b is the length in bytes of a UTF-8 encoded Unicode character starting with byte b. This is 1 on UTF-8 continuation or malformed bytes.
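The mapping from a first byte to a sequence length follows from the UTF-8 encoding itself and can be sketched as below. This is an illustration rather than the library's code. Treating 0xC0 and 0xC1 (which can only start overlong encodings) as malformed is an assumption of the sketch.

```ocaml
let utf_8_decode_len b = match Char.code b with
| c when c <= 0x7F -> 1 (* US-ASCII *)
| c when c <= 0xBF -> 1 (* continuation byte *)
| c when c <= 0xC1 -> 1 (* malformed: overlong start byte *)
| c when c <= 0xDF -> 2 (* two-byte sequence *)
| c when c <= 0xEF -> 3 (* three-byte sequence *)
| c when c <= 0xF4 -> 4 (* four-byte sequence *)
| _ -> 1                (* malformed: 0xF5 to 0xFF *)
```

Returning 1 on unexpected bytes lets a caller always make progress when stepping through arbitrary input.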

val is_utf_8_decode : char -> bool

is_utf_8_decode c is true iff c is not a UTF-8 continuation byte. This means c is either a UTF-8 start byte or a malformed UTF-8 byte.

val find_next_utf_8_decode : string -> start:int -> int

find_next_utf_8_decode s ~start is either String.length s or the index of the first byte at or after start that satisfies is_utf_8_decode.

val find_prev_utf_8_decode : string -> start:int -> int

find_prev_utf_8_decode s ~start is either 0 or the index of the byte at or before start that satisfies is_utf_8_decode.
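Resynchronizing on a decode boundary amounts to skipping continuation bytes, which all have the bit pattern 10xxxxxx. The following is a minimal sketch under that definition, not the library's implementation. The clamping of start is an assumption.

```ocaml
(* True iff [c] is not of the form 0b10xxxxxx. *)
let is_utf_8_decode c = Char.code c land 0xC0 <> 0x80

let find_next_utf_8_decode s ~start =
  let len = String.length s in
  let rec loop i =
    if i >= len then len else
    if is_utf_8_decode s.[i] then i else loop (i + 1)
  in
  loop (max 0 start)

let find_prev_utf_8_decode s ~start =
  let rec loop i =
    if i <= 0 then 0 else
    if is_utf_8_decode s.[i] then i else loop (i - 1)
  in
  loop (min start (String.length s - 1))
```

In "a\xC3\xA9b" (the letter é between a and b), index 2 is a continuation byte: scanning forward from it lands on 3 and scanning backward lands on 1.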

Whitespace

val is_white : char -> bool

is_white c is true iff c is US-ASCII whitespace (0x20, 0x09, 0x0A, 0x0B, 0x0C or 0x0D).

val find_next_white : string -> start:int -> int

find_next_white s ~start is either String.length s or the first byte position at or after start such that is_white is true.

val find_prev_white : string -> start:int -> int

find_prev_white s ~start is either 0 or the first byte position at or before start such that is_white is true.
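The whitespace predicate and the two scans can be sketched as follows, using only the standard library. This is an illustration, not the library's source, and the clamping of start is an assumption.

```ocaml
(* 0x20, and 0x09 to 0x0D ('\t' is 0x09, '\r' is 0x0D). *)
let is_white = function ' ' | '\t' .. '\r' -> true | _ -> false

let find_next_white s ~start =
  let len = String.length s in
  let rec loop i =
    if i >= len then len else
    if is_white s.[i] then i else loop (i + 1)
  in
  loop (max 0 start)

let find_prev_white s ~start =
  let rec loop i =
    if i <= 0 then 0 else
    if is_white s.[i] then i else loop (i - 1)
  in
  loop (min start (String.length s - 1))
```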

Words

val find_next_after_eow : string -> start:int -> int

find_next_after_eow s ~start is either String.length s or the byte position of the first is_white byte found after skipping first white and then non-white bytes, starting at start.

val find_prev_sow : string -> start:int -> int

find_prev_sow s ~start is either 0 or the byte position reached after skipping backward first white and then non-white bytes, starting at start.
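Both word motions are two skips in a row: over white bytes, then over non-white bytes. The sketch below illustrates this with the standard library. It is not the library's code. Whether the byte at start itself takes part in the backward scan is an assumption made here, as is the clamping of start.

```ocaml
let is_white = function ' ' | '\t' .. '\r' -> true | _ -> false
let is_not_white c = not (is_white c)

(* Skip white, then non-white, going forward. The result is the position
   just after the end of the word, or [String.length s]. *)
let find_next_after_eow s ~start =
  let len = String.length s in
  let rec skip sat i = if i < len && sat s.[i] then skip sat (i + 1) else i in
  let i = min (max 0 start) len in
  skip is_not_white (skip is_white i)

(* Skip white, then non-white, going backward. The result is the position
   of the first byte of the word, or 0. *)
let find_prev_sow s ~start =
  let rec skip sat i = if i >= 0 && sat s.[i] then skip sat (i - 1) else i in
  let i = min start (String.length s - 1) in
  skip is_not_white (skip is_white i) + 1
```

On "foo bar  baz", moving forward from 3 yields 7 (just after "bar") and moving backward from 8 yields 4 (the start of "bar").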

Grapheme clusters and TTY width

Note. This is a simple notion of grapheme cluster based on Uucp.Break.tty_width_hint.

val find_next_gc : string -> after:int -> int

find_next_gc s ~after is String.length s or the byte position of the grapheme cluster after the one starting at after.

val find_next_gc_and_tty_width : string -> after:int -> int * int

find_next_gc_and_tty_width s ~after is like find_next_gc but also returns, in the second component, the TTY width of the grapheme cluster at after.

val find_prev_gc : string -> before:int -> int

find_prev_gc s ~before is 0 or the byte position of the grapheme cluster before the one starting at before.

val find_prev_eol_and_tty_width : string -> before:int -> int * int

find_prev_eol_and_tty_width s ~before returns, in the first component, either 0 or the index of the byte preceding position before that satisfies is_eol and, in the second component, the TTY width needed to go from that index to before.

val find_next_tty_width_or_eol : string -> start:int -> w:int -> int

find_next_tty_width_or_eol s ~start ~w is the index of the grapheme cluster reached after advancing by TTY width w from start, or the index of the next end of line if that occurs first.