Module Cmarkit_html

CommonMark to HTML renderer.

The renderer can be customized to handle abstract syntax tree extensions or to override parts of the default rendering, see the custom renderer example.

Named entity resolution

type entities = string -> string option option

The type for named entity resolvers. Given an entity name n (i.e. without the '&' and ';'), if the function returns:

  • None, the entity is unknown. In that case the entity becomes pure text, like in this example.
  • Some None, the entity is known but is not expanded to its definition, it is written as an entity in the output.
  • Some (Some def), the entity is known and def is the UTF-8 encoding of the entity definition.

val std_entities : entities

std_entities resolves named entities according to the CommonMark specification. This replaces all HTML named characters by their UTF-8 text and any other entity becomes pure text in the output like in this example.

val std_entities_verbatim : entities

std_entities_verbatim is like std_entities except it writes the entities that resolve as entities in the output rather than expanding them to their UTF-8 definition.
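
The three possible results of a resolver can be illustrated with a small standalone sketch. It redefines the entities type locally so that it compiles without the library; my_entities and describe are hypothetical names used only for illustration, and describe merely spells out the three cases in prose form, it is not how the renderer is implemented.

```ocaml
(* Local copy of the resolver type documented above, so that the sketch
   compiles without the library. *)
type entities = string -> string option option

(* Hypothetical resolver: expands a few names, keeps "nbsp" as an entity
   and reports every other name as unknown. *)
let my_entities : entities = function
| "amp" -> Some (Some "&")
| "copy" -> Some (Some "\xC2\xA9") (* UTF-8 encoding of U+00A9 *)
| "nbsp" -> Some None (* known, but written as an entity in the output *)
| _ -> None (* unknown, becomes plain text *)

(* Hypothetical helper describing what each result means for a name. *)
let describe (resolve : entities) name = match resolve name with
| None -> "text: &" ^ name ^ ";"
| Some None -> "entity: &" ^ name ^ ";"
| Some (Some def) -> "expanded: " ^ def
```

A resolver of this shape could be passed as the entities argument of of_doc in place of std_entities.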

Rendering

type funs

The type for rendering functions.

val of_doc : ?funs:funs -> entities:entities -> safe:bool -> Cmarkit.t -> string

of_doc ~funs ~entities ~safe d generates an HTML fragment for d.

  • entities is used to resolve named entity references. Use std_entities to render CommonMark according to the specification and see entities for more details.
  • If safe is true a safe rendering is requested. What happens exactly depends on funs. The safe rendering of the default rendering functions is described in funs_default. Using safe renderings is a good first step at preventing XSS from untrusted user inputs but you may want to delegate that task to a better HTML sanitizer.
  • funs are the rendering functions (defaults to funs_default). If these functions ever return false, funs_unknown is used as a last resort.

val buffer_add_doc : Stdlib.Buffer.t -> ?funs:funs -> entities:entities -> safe:bool -> Cmarkit.t -> unit

buffer_add_doc b ~funs ~entities ~safe d is like of_doc but adds the rendering to b.

Renderer

type t

The type for CommonMark HTML renderers.

val safe : t -> bool

safe r is true iff a safe rendering is requested.

val attribute : t -> esc:[ `Html | `Pct ] -> Cmarkit.attribute -> unit

attribute r ~esc a renders a like str_html_escaped (if esc is `Html) or str_pct_encoded (if esc is `Pct) but processes entities that may be present in a.

val entity : t -> Cmarkit.entity -> unit

entity r e renders entity e.

val block : t -> Cmarkit.block -> unit

block r b renders block b on r.

val inline : t -> Cmarkit.inline -> unit

inline r i renders inline i on r.

val str_html_escaped : t -> string -> unit

str_html_escaped r s renders string s on r with HTML markup delimiters '<', '>', '&', '\'' and '"' escaped to HTML entities.
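
As an illustration, here is a minimal standalone model of this escaping. It writes to a Stdlib.Buffer.t instead of a renderer, and the exact entity names it emits (for example &apos; for '\'') are an assumption, since the description above only says that the delimiters are escaped to HTML entities. The names buffer_add_html_escaped and html_escaped are hypothetical.

```ocaml
(* Standalone model of the escaping described for str_html_escaped. It
   writes to a Buffer.t rather than to a renderer. The entity names are
   an assumption. *)
let buffer_add_html_escaped b s =
  let add_char = function
  | '<' -> Buffer.add_string b "&lt;"
  | '>' -> Buffer.add_string b "&gt;"
  | '&' -> Buffer.add_string b "&amp;"
  | '\'' -> Buffer.add_string b "&apos;"
  | '"' -> Buffer.add_string b "&quot;"
  | c -> Buffer.add_char b c (* every other byte is copied verbatim *)
  in
  String.iter add_char s

let html_escaped s =
  let b = Buffer.create (String.length s) in
  buffer_add_html_escaped b s;
  Buffer.contents b
```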

val uchar_html_escaped : t -> Stdlib.Uchar.t -> unit

uchar_html_escaped r u renders the UTF-8 encoding of u on r with HTML markup delimiters '<', '>', '&', '\'' and '"' escaped to HTML entities. This also renders U+0000 to Uchar.rep.

val str_pct_encoded : t -> string -> unit

str_pct_encoded r s renders string s on r with every character percent-encoded except the unreserved, sub-delims and gen-delims URI characters. In other words only 'a'..'z', 'A'..'Z', '0'..'9', '-', '.', '_', '~' and '!', '$', '&', '\'', '(', ')', '*', '+', ',', ';', '=' and ':', '/', '?', '#', '[', ']', '@' are not percent-encoded. However it also replaces both '&' and '\'' by their corresponding HTML entities.
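
The rule can be modelled by the following standalone sketch, which follows the description literally and returns a string instead of writing to a renderer. The names is_unencoded and pct_encoded are hypothetical, and both the uppercase hexadecimal digits and the entity names &amp; and &apos; are assumptions.

```ocaml
(* Characters that the description lists as not percent-encoded. *)
let is_unencoded = function
| 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '-' | '.' | '_' | '~' (* unreserved *)
| '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '='
  (* sub-delims *)
| ':' | '/' | '?' | '#' | '[' | ']' | '@' (* gen-delims *) -> true
| _ -> false

(* Model of the described encoding, byte by byte. '&' and '\'' are
   replaced by HTML entities (names assumed), other listed characters
   are kept and every remaining byte becomes %XX. *)
let pct_encoded s =
  let b = Buffer.create (String.length s) in
  let add_char = function
  | '&' -> Buffer.add_string b "&amp;"
  | '\'' -> Buffer.add_string b "&apos;"
  | c when is_unencoded c -> Buffer.add_char b c
  | c -> Buffer.add_string b (Printf.sprintf "%%%02X" (Char.code c))
  in
  String.iter add_char s;
  Buffer.contents b
```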

val byte : t -> char -> unit

byte r c renders the byte c verbatim on r.

val str : t -> string -> unit

str r s renders string s verbatim on r.

Rendering functions

val funs : ?block:(t -> Cmarkit.block -> bool) -> ?inline:(t -> Cmarkit.inline -> bool) -> unit -> funs

funs ~block ~inline () are rendering functions. The block and inline functions:

  • Must return true on values they rendered.
  • Must return false on values they did not render.
  • Must use block and inline with the renderer they receive if they need to invoke the renderer recursively.
  • Should be considerate of safe renderings.
  • Default to fun _ _ -> false.

val funs_block : funs -> t -> Cmarkit.block -> bool

funs_block funs is funs's block rendering function.

val funs_inline : funs -> t -> Cmarkit.inline -> bool

funs_inline funs is funs's inline rendering function.

val compose : funs -> funs -> funs

compose g f renders first with f and, if the rendering function returned false, falls back on g.
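
The argument order is easy to get wrong, so here is a toy model of the fallback behaviour. It does not use the library's types: value, funs, default, xml_rule and render are hypothetical stand-ins, and a single function replaces the separate block and inline functions. Only the order in which compose tries its arguments mirrors the description above.

```ocaml
(* Toy model, not the library's types: a value to render and rendering
   functions that report whether they handled it. *)
type value = Text of string | Rule | Other

type funs = Buffer.t -> value -> bool

(* compose g f tries f first and falls back on g if f returned false. *)
let compose (g : funs) (f : funs) : funs = fun b v -> f b v || g b v

let default : funs = fun b -> function
| Text s -> Buffer.add_string b s; true
| Rule -> Buffer.add_string b "<hr>"; true
| Other -> false

(* Overrides Rule only, everything else is left to the fallback. *)
let xml_rule : funs = fun b -> function
| Rule -> Buffer.add_string b "<hr/>"; true
| _ -> false

(* If no function handled the value, emit a last resort comment. *)
let render (funs : funs) v =
  let b = Buffer.create 16 in
  if funs b v then Buffer.contents b else "<!-- unknown -->"
```

In this model, render (compose default xml_rule) uses xml_rule for Rule and default for Text values, which is the order needed when overriding part of a default rendering.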

Predefined rendering functions

val funs_default : funs

funs_default are the default CommonMark rendering functions.

On safe renderings:

  • Raw HTML is discarded and replaced by an HTML comment in the output.
  • URLs that satisfy Cmarkit.is_unsafe_link are replaced by the empty string.

val funs_xhtml : funs

funs_xhtml is like funs_default but explicitly closes empty tags to possibly make the output valid XML.

val funs_unknown : funs

funs_unknown are the last resort rendering functions. They always render an HTML comment indicating that an unknown block or inline value was found and return true.

Custom renderer example

Let's assume you want to:

  1. Extend the abstract syntax tree with a Doc block which allows splicing documents into another one (note that this functionality is already built-in via the Cmarkit.Blocks block case).
  2. Slightly alter the default rendering of thematic breaks to be able to use an XML parser to reparse the result (note that formally there are a few other renderings to alter; the funs_xhtml functions already do that for you).
  3. Otherwise use the default renderer.

This boils down to adding a new case to the syntax tree, defining a new funs value whose block rendering function handles the extension and thematic breaks, and invoking it before the default rendering functions by using compose.

type Cmarkit.block += Doc of Cmarkit.t

let custom =
  let block r = function
  | Cmarkit.Thematic_break _ -> Cmarkit_html.str r "<hr/>"; true
  | Doc (blocks, _) ->
      (* Note, it's important to recurse via Cmarkit_html.block *)
      List.iter (Cmarkit_html.block r) blocks; true
  | _ -> false
  in
  let funs = Cmarkit_html.funs ~block () in
  Cmarkit_html.compose Cmarkit_html.funs_default funs

The rendering functions custom can now be used with the funs argument of the of_doc function.