Module B0_std.Url

URLs

type t = string

The type for URLs.

Kinds

type relative_kind = [
  | `Scheme
  | `Abs_path
  | `Rel_path
  | `Empty
]

The type for kinds of relative references. Represents this alternation.

type kind = [
  | `Abs
  | `Rel of relative_kind
]

The type for kinds of URLs. Represents this alternation.

val kind : string -> kind

kind u determines the kind of u. It decides that u is absolute if u starts with a scheme.
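The classification can be illustrated with a small stand-alone sketch. It is not the B0_std implementation: sketch_kind and scheme_len are hypothetical names, the scheme test follows the RFC 3986 grammar (a letter, then letters, digits, +, - or ., then a colon), and the Scheme case is assumed to denote a scheme-relative reference that begins with //.

```ocaml
(* Hypothetical stand-alone sketch, not B0_std's implementation. *)
type relative_kind = [ `Scheme | `Abs_path | `Rel_path | `Empty ]
type kind = [ `Abs | `Rel of relative_kind ]

let is_alpha c = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')
let is_scheme_char c =
  is_alpha c || ('0' <= c && c <= '9') || c = '+' || c = '-' || c = '.'

(* Index of the ':' ending a leading RFC 3986 scheme in [u], if any. *)
let scheme_len u =
  let n = String.length u in
  let rec loop i =
    if i >= n then None
    else if u.[i] = ':' then Some i
    else if is_scheme_char u.[i] then loop (i + 1)
    else None
  in
  if n > 0 && is_alpha u.[0] then loop 1 else None

(* Assumption: [`Scheme] means a scheme-relative reference ("//..."). *)
let sketch_kind u : kind =
  match scheme_len u with
  | Some _ -> `Abs
  | None ->
      let n = String.length u in
      if n = 0 then `Rel `Empty
      else if n >= 2 && u.[0] = '/' && u.[1] = '/' then `Rel `Scheme
      else if u.[0] = '/' then `Rel `Abs_path
      else `Rel `Rel_path
```

Under these assumptions, "https://example.org/a" classifies as absolute, "//example.org/a" as scheme-relative, "/a/b" as an absolute path, "a/b" as a relative path and "" as empty.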

Components

val scheme : string -> string option

scheme u tries to extract a URL scheme from u.

val authority : string -> string option

authority u tries to extract a URL authority (HOST:PORT) part from u.

val path_and_query : string -> string option

path_and_query u tries to extract a URL path and query part from u.
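As a rough illustration of what the three component functions extract, here is a stand-alone sketch using only the OCaml standard library. The sketch_ functions and their helpers are hypothetical and do not reproduce B0_std's parsing rules. In particular, how the real functions treat fragments, empty components or URLs without an authority is not stated above; this sketch drops the fragment and returns None for an empty path and query.

```ocaml
(* Hypothetical sketch; B0_std's actual parsing rules may differ. *)
let is_alpha = function 'a' .. 'z' | 'A' .. 'Z' -> true | _ -> false
let is_scheme_char c =
  is_alpha c || (match c with '0' .. '9' | '+' | '-' | '.' -> true | _ -> false)

(* Index of the ':' that ends a leading scheme, if any. *)
let scheme_end u =
  let n = String.length u in
  let rec loop i =
    if i >= n then None
    else if u.[i] = ':' then Some i
    else if is_scheme_char u.[i] then loop (i + 1)
    else None
  in
  if n > 0 && is_alpha u.[0] then loop 1 else None

let sketch_scheme u = Option.map (fun i -> String.sub u 0 i) (scheme_end u)

(* Index just after the "//" that introduces the authority, if present. *)
let authority_start u =
  let i = match scheme_end u with Some i -> i + 1 | None -> 0 in
  if i + 1 < String.length u && u.[i] = '/' && u.[i + 1] = '/'
  then Some (i + 2) else None

(* First index at or after [i] holding '/', '?' or '#', else the length. *)
let authority_stop u i =
  let n = String.length u in
  let rec loop i =
    if i >= n then n
    else match u.[i] with '/' | '?' | '#' -> i | _ -> loop (i + 1)
  in
  loop i

let sketch_authority u =
  match authority_start u with
  | None -> None
  | Some first ->
      Some (String.sub u first (authority_stop u first - first))

(* Assumption: the fragment is dropped and an empty result yields [None]. *)
let sketch_path_and_query u =
  let first = match authority_start u with
    | Some i -> authority_stop u i
    | None -> (match scheme_end u with Some i -> i + 1 | None -> 0)
  in
  let last = match String.index_from_opt u first '#' with
    | Some i -> i
    | None -> String.length u
  in
  if last = first then None else Some (String.sub u first (last - first))
```

With this sketch, "https://example.org:8080/a/b?q=1#frag" yields the scheme "https", the authority "example.org:8080" and the path and query "/a/b?q=1".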

Scraping

val list_of_text_scrape : string -> string list

list_of_text_scrape s roughly finds URLs and relative or absolute paths in s by looking in order:

  1. For the next href or src substring, then tries to parse the content of an HTML attribute. This may result in relative or absolute paths.
  2. For the next http substring in s, then delimits a URL depending on the previous characters and checks that the delimited URL starts with http:// or https://.
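The two steps can be sketched roughly as follows. This is an illustrative approximation, not the B0_std code: sketch_scrape and its helpers are hypothetical names, attribute values are assumed to be quoted, and the rule for choosing the closing delimiter from the preceding character (a quote, < or an opening parenthesis, otherwise whitespace) is a guess at what "depending on the previous characters" means.

```ocaml
(* Hypothetical sketch of the two scraping steps; not B0_std's code. *)

(* [starts_at s i p] is true if [p] occurs in [s] at index [i]. *)
let starts_at s i p =
  let pl = String.length p in
  i + pl <= String.length s && String.sub s i pl = p

let is_space = function ' ' | '\t' | '\n' | '\r' -> true | _ -> false

(* Step 1: from index [i], just after an attribute name, parse a quoted
   attribute value. Returns the value and the index after the closing quote. *)
let attribute_value s i =
  let n = String.length s in
  let rec skip i = if i < n && is_space s.[i] then skip (i + 1) else i in
  let i = skip i in
  if i >= n || s.[i] <> '=' then None else
  let i = skip (i + 1) in
  if i >= n || (s.[i] <> '"' && s.[i] <> '\'') then None else
  match String.index_from_opt s (i + 1) s.[i] with
  | None -> None
  | Some j -> Some (String.sub s (i + 1) (j - i - 1), j + 1)

(* Step 2: delimit a candidate URL starting at [i]. Assumption: the closing
   delimiter is derived from the character that precedes [i]. *)
let delimited_url s i =
  let n = String.length s in
  let closer =
    if i = 0 then None else
    match s.[i - 1] with
    | '"' -> Some '"' | '\'' -> Some '\'' | '<' -> Some '>' | '(' -> Some ')'
    | _ -> None
  in
  let rec stop j =
    if j >= n || is_space s.[j] || (Some s.[j] = closer) then j
    else stop (j + 1)
  in
  let j = stop i in
  (String.sub s i (j - i), j)

let sketch_scrape s =
  let n = String.length s in
  let rec loop acc i =
    if i >= n then List.rev acc else
    let attr =
      if starts_at s i "href" then attribute_value s (i + 4)
      else if starts_at s i "src" then attribute_value s (i + 3)
      else None
    in
    match attr with
    | Some (v, next) -> loop (v :: acc) next
    | None ->
        if starts_at s i "http" then begin
          let url, next = delimited_url s i in
          (* Keep the candidate only if it really is an http(s) URL. *)
          if starts_at url 0 "http://" || starts_at url 0 "https://"
          then loop (url :: acc) next
          else loop acc (i + 1)
        end else loop acc (i + 1)
  in
  loop [] 0
```

For example, on the text `<a href="/doc/index.html">x</a> see (https://example.org/a) and http-ish` this sketch returns the path /doc/index.html followed by the URL https://example.org/a, and it rejects http-ish because it lacks the http:// or https:// prefix.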