Module Astring.String.Sub

module Sub: sig .. end

Substrings.

A substring defines a possibly empty subsequence of bytes in a base string.

The positions of a string s of length l are the slits found before each byte and after the last byte of the string. They are labelled from left to right by increasing number in the range [0;l].

positions  0   1   2   3   4    l-1    l
           +---+---+---+---+     +-----+
  indices  | 0 | 1 | 2 | 3 | ... | l-1 |
           +---+---+---+---+     +-----+

The ith byte index is between positions i and i+1.

Formally we define a substring of s as being a subsequence of bytes defined by a start and a stop position. The former is always smaller than or equal to the latter. When both positions are equal the substring is empty. Note that for a given base string there are as many empty substrings as there are positions in the string.

Like in strings, we index the bytes of a substring using zero-based indices.

See how to use substrings to parse data.


Substrings

type t = Astring.String.sub 

The type for substrings.

val empty : Astring.String.sub

empty is the empty substring of the empty string Astring.String.empty.

val v : ?start:int -> ?stop:int -> string -> Astring.String.sub

v ~start ~stop s is the substring of s that starts at position start (defaults to 0) and stops at position stop (defaults to String.length s).

val start_pos : Astring.String.sub -> int

start_pos s is s's start position in the base string.

val stop_pos : Astring.String.sub -> int

stop_pos s is s's stop position in the base string.

val base_string : Astring.String.sub -> string

base_string s is s's base string.

val length : Astring.String.sub -> int

length s is the number of bytes in s.

val get : Astring.String.sub -> int -> char

get s i is the byte of s at its zero-based index i.
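The constructor and accessors above can be tried out as follows. This is a minimal sketch, assuming the astring library is installed and brought into scope with open Astring; the names s, a, whole and e are only illustrative.

```ocaml
open Astring

let s = "Revolt now!"

(* The substring between positions 2 and 6 of s, that is the bytes "volt". *)
let a = String.Sub.v ~start:2 ~stop:6 s

(* Omitting both optional arguments spans the whole string. *)
let whole = String.Sub.v s

(* An empty substring: start and stop are the same position. *)
let e = String.Sub.v ~start:7 ~stop:7 s

let () =
  Printf.printf "%S [%d;%d] length %d, first byte %C\n"
    (String.Sub.to_string a) (String.Sub.start_pos a) (String.Sub.stop_pos a)
    (String.Sub.length a) (String.Sub.get a 0)
```

Note that get uses the substring's own zero-based indices, so index 0 of a is the byte at index 2 of s.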

val get_byte : Astring.String.sub -> int -> int

get_byte s i is Char.to_int (get s i).

val head : ?rev:bool -> Astring.String.sub -> char option

head s is Some (get s h) with h = 0 if rev = false (default) or h = length s - 1 if rev = true. None is returned if s is empty.

val get_head : ?rev:bool -> Astring.String.sub -> char

get_head s is like Astring.String.Sub.head but raises Invalid_argument if s is empty.
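A short sketch of head and get_head, assuming open Astring. The helper first_or_default is hypothetical and not part of the library; it checks for emptiness before calling get_head, since get_head returns a plain char and has no option value to signal an empty substring.

```ocaml
open Astring

let a = String.Sub.v ~start:2 ~stop:6 "Revolt now!" (* the bytes "volt" *)

let first = String.Sub.head a              (* first byte, if any *)
let last = String.Sub.head ~rev:true a     (* last byte, if any *)
let nothing = String.Sub.head String.Sub.empty

(* Hypothetical helper: only call get_head on non-empty substrings. *)
let first_or_default ~default s =
  if String.Sub.is_empty s then default else String.Sub.get_head s
```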

val of_string : string -> Astring.String.sub

of_string s is v s.

val to_string : Astring.String.sub -> string

to_string s is the bytes of s as a string.

val rebase : Astring.String.sub -> Astring.String.sub

rebase s is v (to_string s). This puts s on a base string made solely of its bytes.
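The effect of rebase on positions can be sketched as follows, assuming open Astring. A plausible use is to stop referencing a large base string when only a few of its bytes are still needed.

```ocaml
open Astring

let a = String.Sub.v ~start:2 ~stop:6 "Revolt now!"

(* r has the same bytes as a but lives on a fresh base string "volt". *)
let r = String.Sub.rebase a

let () =
  Printf.printf "base %S, positions [%d;%d]\n"
    (String.Sub.base_string r) (String.Sub.start_pos r) (String.Sub.stop_pos r)
```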

val hash : Astring.String.sub -> int

hash s is Hashtbl.hash s.

Stretching substrings

See the graphical guide.

val start : Astring.String.sub -> Astring.String.sub

start s is the empty substring at the start position of s.

val stop : Astring.String.sub -> Astring.String.sub

stop s is the empty substring at the stop position of s.

val base : Astring.String.sub -> Astring.String.sub

base s is a substring that spans the whole base string of s.

val tail : ?rev:bool -> Astring.String.sub -> Astring.String.sub

tail s is s without its first (rev is false, default) or last (rev is true) byte or s if it is empty.
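The four functions above, applied to the substring a of the graphical guide. This is a sketch assuming open Astring; range is a hypothetical helper that returns the start and stop positions as a pair.

```ocaml
open Astring

let s = "Revolt now!"
let a = String.Sub.v ~start:2 ~stop:6 s   (* "volt" *)
let c = String.Sub.v ~start:7 ~stop:7 s   (* empty substring *)

(* Hypothetical helper: the positions of a substring as a pair. *)
let range sub = String.Sub.start_pos sub, String.Sub.stop_pos sub

let start_a = range (String.Sub.start a)   (* empty, at the start of a *)
let stop_a = range (String.Sub.stop a)     (* empty, at the stop of a *)
let base_a = range (String.Sub.base a)     (* spans all of s *)
let tail_a = String.Sub.to_string (String.Sub.tail a)
let tail_rev_a = String.Sub.to_string (String.Sub.tail ~rev:true a)
let tail_c = range (String.Sub.tail c)     (* tail of an empty substring *)
```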

val extend : ?rev:bool ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub

extend ~rev ~max ~sat s extends s by at most max consecutive sat satisfying bytes of the base string located after stop s (rev is false, default) or before start s (rev is true). If max is unspecified the extension is limited by the extents of the base string of s. sat defaults to fun _ -> true.

val reduce : ?rev:bool ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub

reduce ~rev ~max ~sat s reduces s by at most max consecutive sat satisfying bytes of s located before stop s (rev is false, default) or after start s (rev is true). If max is unspecified the reduction is limited by the extents of the substring s. sat defaults to fun _ -> true.
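A sketch of how rev, max and sat interact in extend and reduce, assuming open Astring. The shorthand str is only illustrative; Char.Ascii.is_letter is the ASCII letter predicate from Astring's Char module.

```ocaml
open Astring

let a = String.Sub.v ~start:2 ~stop:6 "Revolt now!"   (* "volt" *)
let str = String.Sub.to_string

(* Without max and sat, extension runs to the end of the base string. *)
let ext = str (String.Sub.extend a)
let ext_rev = str (String.Sub.extend ~rev:true a)

(* max bounds the number of bytes added. *)
let ext_max = str (String.Sub.extend ~max:2 a)

(* sat stops the extension at the first byte that does not satisfy it;
   here the byte right after a is a space, so nothing is added. *)
let ext_sat = str (String.Sub.extend ~sat:Char.Ascii.is_letter a)

(* reduce removes bytes from the end (default) or from the start. *)
let red = str (String.Sub.reduce ~max:1 a)
let red_rev = str (String.Sub.reduce ~rev:true ~max:1 a)
let red_sat = str (String.Sub.reduce ~sat:(fun c -> c = 't') a)
```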

val extent : Astring.String.sub -> Astring.String.sub -> Astring.String.sub

extent s s' is the smallest substring that includes all the positions of s and s'.

val overlap : Astring.String.sub -> Astring.String.sub -> Astring.String.sub option

overlap s s' is the smallest substring that includes all the positions common to s and s' or None if there are no such positions. Note that the overlap substring may be empty.
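The following sketch reproduces the extent and overlap rows of the graphical guide, assuming open Astring. All substrings are deliberately built on the same base string s, since positions are only meaningful relative to one base string.

```ocaml
open Astring

let s = "Revolt now!"
let a = String.Sub.v ~start:2 ~stop:6 s   (* "volt" *)
let b = String.Sub.v ~start:0 ~stop:3 s   (* "Rev" *)
let c = String.Sub.v ~start:7 ~stop:7 s   (* empty, right before "now!" *)

let ext_ab = String.Sub.extent a b    (* covers both a and b *)
let ovl_ab = String.Sub.overlap a b   (* the positions shared by a and b *)
let ovl_ac = String.Sub.overlap a c   (* a and c share no position *)
```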

Appending substrings

val append : Astring.String.sub -> Astring.String.sub -> Astring.String.sub

append s s' is like Astring.String.append. The substrings can be on different bases and the result is on a base string that holds exactly the appended bytes.

val concat : ?sep:Astring.String.sub -> Astring.String.sub list -> Astring.String.sub

concat ~sep ss is like Astring.String.concat. The substrings can all be on different bases and the result is on a base string that holds exactly the concatenated bytes.
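A small sketch of append and concat on substrings with different base strings, assuming open Astring. The input strings are arbitrary examples.

```ocaml
open Astring

let a = String.Sub.v ~start:2 ~stop:6 "Revolt now!"   (* "volt" *)
let b = String.Sub.v ~stop:3 "now or never"           (* "now" *)

(* The result lives on a new base string made of the appended bytes. *)
let ab = String.Sub.append a b

(* The separator is itself a substring. *)
let joined = String.Sub.concat ~sep:(String.Sub.v ", ") [a; b]
```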

Predicates

val is_empty : Astring.String.sub -> bool

is_empty s is length s = 0.

val is_prefix : affix:Astring.String.sub -> Astring.String.sub -> bool

is_prefix is like Astring.String.is_prefix. Only bytes are compared, affix can be on a different base string.

val is_infix : affix:Astring.String.sub -> Astring.String.sub -> bool

is_infix is like Astring.String.is_infix. Only bytes are compared, affix can be on a different base string.

val is_suffix : affix:Astring.String.sub -> Astring.String.sub -> bool

is_suffix is like Astring.String.is_suffix. Only bytes are compared, affix can be on a different base string.

val for_all : (char -> bool) -> Astring.String.sub -> bool

for_all is like Astring.String.for_all on the substring.

val exists : (char -> bool) -> Astring.String.sub -> bool

exists is like Astring.String.exists on the substring.

val same_base : Astring.String.sub -> Astring.String.sub -> bool

same_base s s' is true iff the substrings s and s' have the same base string according to physical equality.

val equal_bytes : Astring.String.sub -> Astring.String.sub -> bool

equal_bytes s s' is true iff the substrings s and s' have exactly the same bytes. The substrings can be on a different base string.

val compare_bytes : Astring.String.sub -> Astring.String.sub -> int

compare_bytes s s' compares the bytes of s and s' in lexicographical order. The substrings can be on a different base string.

val equal : Astring.String.sub -> Astring.String.sub -> bool

equal s s' is true iff s and s' have the same positions.

val compare : Astring.String.sub -> Astring.String.sub -> int

compare s s' compares the positions of s and s' in lexicographical order.
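The difference between comparing bytes and comparing positions can be sketched as follows, assuming open Astring. Here x and y hold the same bytes at different positions of one base string, while z holds the same bytes on another base string; equal and compare are only applied to substrings sharing a base.

```ocaml
open Astring

let s = "abcabc"
let x = String.Sub.v ~start:0 ~stop:3 s   (* first "abc" *)
let y = String.Sub.v ~start:3 ~stop:6 s   (* second "abc" *)
let z = String.Sub.v "abc"                (* "abc" on a different base *)

let same_bytes = String.Sub.equal_bytes x y      (* bytes agree *)
let same_positions = String.Sub.equal x y        (* positions differ *)
let shared_base = String.Sub.same_base x y
let other_base = String.Sub.same_base x z
let by_bytes = String.Sub.compare_bytes x z
let by_positions = String.Sub.compare x y

let has_prefix = String.Sub.is_prefix ~affix:(String.Sub.v "ab") x
let has_suffix = String.Sub.is_suffix ~affix:(String.Sub.v "bc") y
```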

Extracting substrings

Extracted substrings are always on the same base string as the substring s acted upon.

val with_range : ?first:int -> ?len:int -> Astring.String.sub -> Astring.String.sub

with_range is like Astring.String.sub_with_range. The indices are the substring's zero-based ones, not those in the base string.

val with_index_range : ?first:int -> ?last:int -> Astring.String.sub -> Astring.String.sub

with_index_range is like Astring.String.sub_with_index_range. The indices are the substring's zero-based ones, not those in the base string.

val trim : ?drop:(char -> bool) -> Astring.String.sub -> Astring.String.sub

trim is like Astring.String.trim. If all bytes are dropped, returns an empty substring located in the middle of the argument.
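A sketch of the range and trim functions, assuming open Astring. It illustrates that first, len and last are indices of the substring, while the result keeps positions in the original base string.

```ocaml
open Astring

let a = String.Sub.v ~start:2 ~stop:6 "Revolt now!"   (* "volt" *)

(* Two bytes starting at index 1 of a, that is "ol". *)
let r = String.Sub.with_range ~first:1 ~len:2 a

(* The bytes at indices 1 to 2 of a, also "ol". *)
let ir = String.Sub.with_index_range ~first:1 ~last:2 a

(* By default trim drops ASCII white space on both sides. *)
let t = String.Sub.trim (String.Sub.v "  hi  ")
```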

val span : ?rev:bool ->
?min:int ->
?max:int ->
?sat:(char -> bool) ->
Astring.String.sub -> Astring.String.sub * Astring.String.sub

span is like Astring.String.span. For a substring s a left empty span is start s and a right empty span is stop s.

val take : ?rev:bool ->
?min:int ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub

take is like Astring.String.take.

val drop : ?rev:bool ->
?min:int ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub

drop is like Astring.String.drop.
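span, take and drop are the usual building blocks for hand-written parsers. The sketch below, assuming open Astring, splits a run of digits from the rest of the input; the input string is an arbitrary example.

```ocaml
open Astring

let input = String.Sub.v "123abc"

(* Longest prefix of digits and the remaining bytes. *)
let digits, rest = String.Sub.span ~sat:Char.Ascii.is_digit input

(* take keeps the matching span, drop keeps the remaining one. *)
let taken = String.Sub.take ~sat:Char.Ascii.is_digit input
let dropped = String.Sub.drop ~sat:Char.Ascii.is_digit input

(* With rev the span is searched from the end of the substring. *)
let letters = String.Sub.take ~rev:true ~sat:Char.Ascii.is_letter input

(* If fewer than min bytes match, the matching span is empty. *)
let too_few, untouched = String.Sub.span ~min:4 ~sat:Char.Ascii.is_digit input
```

Both results of span stay on the base string of input, so rest still starts at position 3.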

val cut : ?rev:bool ->
sep:Astring.String.sub ->
Astring.String.sub -> (Astring.String.sub * Astring.String.sub) option

cut is like Astring.String.cut. sep can be on a different base string.

val cuts : ?rev:bool ->
?empty:bool ->
sep:Astring.String.sub -> Astring.String.sub -> Astring.String.sub list

cuts is like Astring.String.cuts. sep can be on a different base string.

val fields : ?empty:bool ->
?is_sep:(char -> bool) -> Astring.String.sub -> Astring.String.sub list

fields is like Astring.String.fields.
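A sketch of cut, cuts and fields, assuming open Astring. The helpers sub, strs and pair are hypothetical conveniences for building substrings and reading results back as strings.

```ocaml
open Astring

(* Hypothetical helpers, not part of the library. *)
let sub s = String.Sub.v s
let strs l = List.map String.Sub.to_string l
let pair = function
  | Some (l, r) -> Some (String.Sub.to_string l, String.Sub.to_string r)
  | None -> None

(* cut splits at the first separator, or at the last one with rev. *)
let kv = pair (String.Sub.cut ~sep:(sub "=") (sub "key=value=x"))
let kv_rev = pair (String.Sub.cut ~rev:true ~sep:(sub "=") (sub "key=value=x"))
let no_sep = pair (String.Sub.cut ~sep:(sub "=") (sub "key"))

(* cuts splits at every separator; empty controls empty results. *)
let parts = strs (String.Sub.cuts ~sep:(sub ",") (sub "a,,b"))
let parts_ne = strs (String.Sub.cuts ~empty:false ~sep:(sub ",") (sub "a,,b"))

(* fields splits on separator bytes, ASCII white space by default. *)
let words = strs (String.Sub.fields (sub "  a b  c "))
```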

Traversing substrings

val find : ?rev:bool ->
(char -> bool) -> Astring.String.sub -> Astring.String.sub option

find ~rev sat s is the substring of s (if any) that spans the first byte that satisfies sat in s after position start s (rev is false, default) or before stop s (rev is true). None is returned if there is no matching byte in s.

val find_sub : ?rev:bool ->
sub:Astring.String.sub -> Astring.String.sub -> Astring.String.sub option

find_sub ~rev ~sub s is the substring of s (if any) that spans the first match of sub in s after position start s (rev is false, default) or before stop s (rev is true). Only bytes are compared and sub can be on a different base string. None is returned if there is no match of sub in s.
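Both search functions return a substring on the base string of s, so the position of the match can be read back. A sketch assuming open Astring; range_of is a hypothetical helper.

```ocaml
open Astring

let s = String.Sub.v "Revolt now!"

(* Hypothetical helper: positions of an optional match. *)
let range_of = function
  | Some m -> Some (String.Sub.start_pos m, String.Sub.stop_pos m)
  | None -> None

(* First white space byte, searching from the start. *)
let space = range_of (String.Sub.find Char.Ascii.is_white s)

(* Last 'o', searching from the end. *)
let last_o = range_of (String.Sub.find ~rev:true (fun c -> c = 'o') s)

(* The searched substring can live on another base string. *)
let now = range_of (String.Sub.find_sub ~sub:(String.Sub.v "now") s)
let never = range_of (String.Sub.find_sub ~sub:(String.Sub.v "never") s)
```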

val filter : (char -> bool) -> Astring.String.sub -> Astring.String.sub

filter sat s is like Astring.String.filter. The result is on a base string that holds only the filtered bytes.

val filter_map : (char -> char option) -> Astring.String.sub -> Astring.String.sub

filter_map f s is like Astring.String.filter_map. The result is on a base string that holds only the filtered bytes.

val map : (char -> char) -> Astring.String.sub -> Astring.String.sub

map is like Astring.String.map. The result is on a base string that holds only the mapped bytes.

val mapi : (int -> char -> char) -> Astring.String.sub -> Astring.String.sub

mapi is like Astring.String.mapi. The result is on a base string that holds only the mapped bytes. The indices are the substring's zero-based ones, not those in the base string.

val fold_left : ('a -> char -> 'a) -> 'a -> Astring.String.sub -> 'a

fold_left is like Astring.String.fold_left.

val fold_right : (char -> 'a -> 'a) -> Astring.String.sub -> 'a -> 'a

fold_right is like Astring.String.fold_right.

val iter : (char -> unit) -> Astring.String.sub -> unit

iter is like Astring.String.iter.

val iteri : (int -> char -> unit) -> Astring.String.sub -> unit

iteri is like Astring.String.iteri. The indices are the substring's zero-based ones, not those in the base string.
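The traversal functions can be sketched as follows, assuming open Astring. Char.Ascii.uppercase and Char.Ascii.is_letter come from Astring's Char module; the other names are illustrative.

```ocaml
open Astring

let a = String.Sub.v ~start:2 ~stop:6 "Revolt now!"   (* "volt" *)

(* map and filter build their result on a new base string. *)
let upper = String.Sub.to_string (String.Sub.map Char.Ascii.uppercase a)
let letters =
  String.Sub.to_string
    (String.Sub.filter Char.Ascii.is_letter (String.Sub.v "a1b2"))

(* Folds only see the bytes of the substring, not of the base string. *)
let count_o = String.Sub.fold_left (fun n c -> if c = 'o' then n + 1 else n) 0 a
let chars = String.Sub.fold_right (fun c acc -> c :: acc) a []

(* iteri passes the substring's own indices, starting at 0. *)
let indexed =
  let acc = ref [] in
  String.Sub.iteri (fun i c -> acc := (i, c) :: !acc) a;
  List.rev !acc
```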

Pretty printing

val pp : Stdlib.Format.formatter -> Astring.String.sub -> unit

pp ppf s prints s's bytes on ppf.

val dump : Stdlib.Format.formatter -> Astring.String.sub -> unit

dump ppf s prints s as a syntactically valid OCaml string on ppf using Astring.String.Ascii.escape_string.

val dump_raw : Stdlib.Format.formatter -> Astring.String.sub -> unit

dump_raw ppf s prints an unspecified raw internal representation of s on ppf.
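The printers are meant to be used with the %a directive of Format. A minimal sketch, assuming open Astring, that renders a substring to a string with Format.asprintf:

```ocaml
open Astring

let a = String.Sub.v ~start:2 ~stop:6 "Revolt now!"

(* pp prints only the bytes of the substring. *)
let shown = Format.asprintf "%a" String.Sub.pp a

(* dump prints an escaped form meant to be valid OCaml string syntax. *)
let dumped = Format.asprintf "%a" String.Sub.dump a

let () = Format.printf "%s@.%s@." shown dumped
```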

OCaml base type conversions

val of_char : char -> Astring.String.sub

of_char c is a string that contains the byte c.

val to_char : Astring.String.sub -> char option

to_char s is the single byte in s or None if there is no byte or more than one in s.

val of_bool : bool -> Astring.String.sub

of_bool b is a string representation for b. Relies on Stdlib.string_of_bool.

val to_bool : Astring.String.sub -> bool option

to_bool s is a bool from s, if any. Relies on Stdlib.bool_of_string.

val of_int : int -> Astring.String.sub

of_int i is a string representation for i. Relies on Stdlib.string_of_int.

val to_int : Astring.String.sub -> int option

to_int s is an int from s, if any. Relies on Stdlib.int_of_string.

val of_nativeint : nativeint -> Astring.String.sub

of_nativeint i is a string representation for i. Relies on Nativeint.to_string.

val to_nativeint : Astring.String.sub -> nativeint option

to_nativeint s is a nativeint from s, if any. Relies on Nativeint.of_string.

val of_int32 : int32 -> Astring.String.sub

of_int32 i is a string representation for i. Relies on Int32.to_string.

val to_int32 : Astring.String.sub -> int32 option

to_int32 s is an int32 from s, if any. Relies on Int32.of_string.

val of_int64 : int64 -> Astring.String.sub

of_int64 i is a string representation for i. Relies on Int64.to_string.

val to_int64 : Astring.String.sub -> int64 option

to_int64 s is an int64 from s, if any. Relies on Int64.of_string.

val of_float : float -> Astring.String.sub

of_float f is a string representation for f. Relies on Stdlib.string_of_float.

val to_float : Astring.String.sub -> float option

to_float s is a float from s, if any. Relies on Stdlib.float_of_string.
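The conversions combine naturally with substring extraction: a value can be parsed directly from a slice of the input without copying it first. A sketch assuming open Astring; the input strings are arbitrary examples.

```ocaml
open Astring

(* Parse the bytes "42" directly out of a larger string. *)
let num = String.Sub.v ~start:4 ~stop:6 "abc 42 def"
let parsed = String.Sub.to_int num

(* Failed conversions yield None rather than raising. *)
let not_int = String.Sub.to_int (String.Sub.v "x")

let printed = String.Sub.to_string (String.Sub.of_int 42)
let flag = String.Sub.to_bool (String.Sub.v "true")
let one = String.Sub.to_char (String.Sub.v "a")
let many = String.Sub.to_char (String.Sub.v "ab")
let half = String.Sub.to_float (String.Sub.v "1.5")
```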

Substring stretching graphical guide

+---+---+---+---+---+---+---+---+---+---+---+
| R | e | v | o | l | t |   | n | o | w | ! |
+---+---+---+---+---+---+---+---+---+---+---+
        |---------------|                      a
        |                                      start a
                        |                      stop a
            |-----------|                      tail a
        |-----------|                          tail ~rev:true a
        |-----------------------------------|  extend a
|-----------------------|                      extend ~rev:true a
|-------------------------------------------|  base a
|-----------|                                  b
|                                              start b
            |                                  stop b
    |-------|                                  tail b
|-------|                                      tail ~rev:true b
|-------------------------------------------|  extend b
|-----------|                                  extend ~rev:true b
|-------------------------------------------|  base b
|-----------------------|                      extent a b
        |---|                                  overlap a b
                            |                  c
                            |                  start c
                            |                  stop c
                            |                  tail c
                            |                  tail ~rev:true c
                            |---------------|  extend c
|---------------------------|                  extend ~rev:true c
|-------------------------------------------|  base c
        |-------------------|                  extent a c
                                         None  overlap a c
                            |---------------|  d
                            |                  start d
                                            |  stop d
                                |-----------|  tail d
                            |-----------|      tail ~rev:true d
                            |---------------|  extend d
|-------------------------------------------|  extend ~rev:true d
|-------------------------------------------|  base d
                            |---------------|  extent d c
                            |                  overlap d c