Module Cmarkit_html

Rendering CommonMark to HTML.

Generates HTML fragments; consult the integration notes for requirements on the webpage.

See a quick example and another one.

Warning. Rendering outputs are unstable; they may be tweaked even between minor versions of the library.

Rendering

val of_doc : ?backend_blocks:bool -> safe:bool -> Cmarkit.Doc.t -> string

of_doc ~safe d is an HTML fragment for d. See renderer for more details and documentation about rendering options.
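
As a minimal usage sketch, assuming the cmarkit library is installed and linked, the following parses a small CommonMark string and renders it with of_doc. The input text and the name html are illustrative only. Since rendering outputs are unstable, the exact markup is not shown here.

```ocaml
(* Assumes the cmarkit library is installed and linked. *)
let html : string =
  (* Parse the CommonMark source into a document value. *)
  let doc = Cmarkit.Doc.of_string "# Title\n\nSome *emphasis*.\n" in
  (* Render it as an HTML fragment, requesting a safe rendering. *)
  Cmarkit_html.of_doc ~safe:true doc

let () = print_string html
```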

Renderers

val renderer : ?backend_blocks:bool -> safe:bool -> unit -> Cmarkit_renderer.t

renderer ~safe () is the default HTML renderer. This renders the strict CommonMark abstract syntax tree and the supported Cmarkit extensions.

The inline, block and document renderers always return true. Unknown block and inline values are rendered by an HTML comment.

The following options are available:

  • safe, if true, HTML blocks and raw HTML inlines are discarded and replaced by an HTML comment in the output. In addition, the URLs of autolinks, links and images that satisfy Cmarkit.Inline.Link.is_unsafe are replaced by the empty string.

    Using safe renderings is a good first step at preventing XSS from untrusted user inputs but you should rather post-process rendering outputs with a dedicated HTML sanitizer.

  • backend_blocks, if true, code blocks with language =html are written verbatim in the output (iff safe is false) and any other code block whose language starts with = is dropped. Defaults to false.
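
The effect of the safe option can be observed with a small sketch like the following, assuming the cmarkit library is installed and linked. The input and the names md, doc, safe_html and unsafe_html are illustrative. The exact text of the replacement comment is not shown, since rendering outputs may change.

```ocaml
(* Assumes the cmarkit library is installed and linked. *)
let md = "A paragraph.\n\n<script>alert(1)</script>\n"
let doc = Cmarkit.Doc.of_string md

(* With ~safe:true the HTML block is discarded and replaced by a comment. *)
let safe_html = Cmarkit_html.of_doc ~safe:true doc

(* With ~safe:false the HTML block is written to the output. *)
let unsafe_html = Cmarkit_html.of_doc ~safe:false doc

let () =
  print_string safe_html;
  print_string unsafe_html
```

As noted above, a safe rendering is only a first step; untrusted input should still go through a dedicated HTML sanitizer.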

See this example to extend or selectively override the renderer.

val xhtml_renderer : ?backend_blocks:bool -> safe:bool -> unit -> Cmarkit_renderer.t

xhtml_renderer is like renderer but explicitly closes empty tags to possibly make the output valid XML. Note that it still renders HTML blocks and inline raw HTML unless safe is true (which also suppresses some URLs).
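
To compare the two renderers, a sketch along these lines may help. It assumes the cmarkit library is installed and linked, and uses Cmarkit_renderer.doc_to_string to run a renderer on a document. The input (a thematic break between two paragraphs) and the names doc, html and xhtml are illustrative.

```ocaml
(* Assumes the cmarkit library is installed and linked. *)
let doc = Cmarkit.Doc.of_string "Above\n\n***\n\nBelow\n"

(* Default HTML renderer: the empty element for the break is left unclosed. *)
let html =
  Cmarkit_renderer.doc_to_string (Cmarkit_html.renderer ~safe:true ()) doc

(* XHTML renderer: the same element is explicitly closed. *)
let xhtml =
  Cmarkit_renderer.doc_to_string (Cmarkit_html.xhtml_renderer ~safe:true ()) doc

let () =
  print_string html;
  print_string xhtml
```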

See this example to extend or selectively override the renderer.

Render functions

Only useful if you extend the renderer.

val safe : Cmarkit_renderer.context -> bool

safe c is true if a safe rendering is requested. See renderer for more information.

val html_escaped_uchar : Cmarkit_renderer.context -> Stdlib.Uchar.t -> unit

html_escaped_uchar c u renders the UTF-8 encoding of u on c with HTML markup delimiters <, >, & and " escaped to HTML entities (single quotes ' are not escaped; use " to delimit your attributes). This also renders U+0000 as Uchar.rep.

val buffer_add_html_escaped_uchar : Stdlib.Buffer.t -> Stdlib.Uchar.t -> unit

buffer_add_html_escaped_uchar is html_escaped_uchar but appends to a buffer value.

val html_escaped_string : Cmarkit_renderer.context -> string -> unit

html_escaped_string c s renders string s on c with HTML markup delimiters <, >, & and " escaped to HTML entities (single quotes ' are not escaped; use " to delimit your attributes).

val buffer_add_html_escaped_string : Stdlib.Buffer.t -> string -> unit

buffer_add_html_escaped_string is html_escaped_string but appends to a buffer value.
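
The escaping rule described for these functions can be modelled with the standard library alone. The sketch below is illustrative and is not the library's implementation; the name html_escape is hypothetical, and the use of the named entities &lt;, &gt;, &amp; and &quot; is an assumption.

```ocaml
(* Illustrative model only; not the library's implementation. *)
let html_escape (s : string) : string =
  let b = Buffer.create (String.length s) in
  let add = function
    | '<' -> Buffer.add_string b "&lt;"
    | '>' -> Buffer.add_string b "&gt;"
    | '&' -> Buffer.add_string b "&amp;"
    | '"' -> Buffer.add_string b "&quot;"
    | c -> Buffer.add_char b c (* single quotes are left as is *)
  in
  String.iter add s;
  Buffer.contents b

let () = print_endline (html_escape {|<a title="it's">R&D</a>|})
(* prints: &lt;a title=&quot;it's&quot;&gt;R&amp;D&lt;/a&gt; *)
```

Because single quotes pass through unchanged, escaped text is only safe inside attributes delimited by double quotes.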

val pct_encoded_string : Cmarkit_renderer.context -> string -> unit

pct_encoded_string c s renders string s on c with everything percent encoded except % and the unreserved, sub-delims and the gen-delims URI characters except brackets [ and ] (to match the cmark tool).

In other words, only the characters %a-zA-Z0-9-._~!$&'()*+,;=:/?#@ are not percent-encoded.

Warning. The function also replaces both & and ' by their corresponding HTML entities, so you can't use this in a context that doesn't allow entities. Besides, this assumes s may already have percent-encoded bits, so it doesn't percent-encode %; as such you can't use this as a general percent-encoding function.
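
The rule and the warning above can be modelled with the standard library alone. This is an illustrative sketch, not the library's implementation: the name pct_encode is hypothetical, and both the uppercase hexadecimal digits and the choice of &amp; and &apos; as the entities are assumptions.

```ocaml
(* Illustrative model only; not the library's implementation. *)
let pct_encode (s : string) : string =
  let b = Buffer.create (String.length s) in
  let add c = match c with
    (* '%' is kept: the input may already contain percent-encoded bits. *)
    | '%'
    (* unreserved *)
    | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '-' | '.' | '_' | '~'
    (* sub-delims, except '&' and single quote which are handled below *)
    | '!' | '$' | '(' | ')' | '*' | '+' | ',' | ';' | '='
    (* gen-delims, except '[' and ']' *)
    | ':' | '/' | '?' | '#' | '@' -> Buffer.add_char b c
    (* Not percent-encoded, but replaced by HTML entities (assumed names). *)
    | '&' -> Buffer.add_string b "&amp;"
    | '\'' -> Buffer.add_string b "&apos;"
    (* Every other byte is percent-encoded. *)
    | c -> Buffer.add_string b (Printf.sprintf "%%%02X" (Char.code c))
  in
  String.iter add s;
  Buffer.contents b

let () = print_endline (pct_encode "/a b/x[1]?q=100%25&r=1")
(* prints: /a%20b/x%5B1%5D?q=100%25&amp;r=1 *)
```

The example shows both caveats at once: the existing %25 is left untouched, and & comes out as an entity, so the result is meant for an HTML attribute rather than for general URI construction.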

val buffer_add_pct_encoded_string : Stdlib.Buffer.t -> string -> unit

buffer_add_pct_encoded_string b s is pct_encoded_string but appends to a buffer value.

HTML integration notes

Code blocks

If a language lang can be extracted from the info string of a code block with Cmarkit.Block.Code_block.language_of_info_string, a language-lang class is added to the corresponding code element. If you want to highlight the syntax, adding highlight.js to your page is an option.

Heading identifiers

Heading identifiers and anchors are added to the output whenever Cmarkit.Block.Heading.id holds a value. If the identifier already exists it is made unique by appending "-" and the first number starting from 1 that makes it unique.
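
The uniqueness rule can be sketched with the standard library as follows. This is a model of the rule as stated, not the renderer's code; the names String_set, unique_id and ids are hypothetical.

```ocaml
module String_set = Set.Make (String)

(* [unique_id seen id] is an identifier that is not in [seen], paired with
   [seen] extended by that identifier. *)
let unique_id seen id =
  let rec pick n =
    let candidate = Printf.sprintf "%s-%d" id n in
    if String_set.mem candidate seen then pick (n + 1) else candidate
  in
  let id = if String_set.mem id seen then pick 1 else id in
  (id, String_set.add id seen)

(* Identifiers assigned to a sequence of headings, in document order. *)
let ids =
  let add (seen, acc) id =
    let id, seen = unique_id seen id in
    (seen, id :: acc)
  in
  let headings = ["intro"; "intro"; "intro-1"; "intro"] in
  List.rev (snd (List.fold_left add (String_set.empty, []) headings))

let () = List.iter print_endline ids
(* prints intro, intro-1, intro-1-1 and intro-2, one per line *)
```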

Maths

If your document has Cmarkit.Inline.Ext_math_span inlines or Cmarkit.Block.Ext_math_block blocks, the default renderer outputs them in \(, \) and \[, \] delimiters. You should add KaTeX or MathJax to your page so that these bits are rendered with the typography they deserve.
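
A small sketch of the math output, assuming the cmarkit library is installed and linked. It relies on parsing with ~strict:false so that the Cmarkit extensions, including math spans, are recognized; the input and the name html are illustrative.

```ocaml
(* Assumes the cmarkit library is installed and linked. *)
let html =
  (* ~strict:false enables the Cmarkit extensions, including math spans. *)
  let doc = Cmarkit.Doc.of_string ~strict:false "Pythagoras: $x^2 + y^2$\n" in
  Cmarkit_html.of_doc ~safe:true doc

(* The math span is expected to appear between \( and \) in the output. *)
let () = print_string html
```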

Page frame

The default renderers only generate HTML fragments. You may want to add a page frame. For example:

let html_doc_of_md ?(lang = "en") ~title ~safe md =
  let doc = Cmarkit.Doc.of_string md in
  let r = Cmarkit_html.renderer ~safe () in
  let buffer_add_doc = Cmarkit_renderer.buffer_add_doc r in
  let buffer_add_title = Cmarkit_html.buffer_add_html_escaped_string in
  Printf.kbprintf Buffer.contents (Buffer.create 1024)
{|<html lang="%s">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>%a</title>
</head>
<body>
%a</body>
</html>|}
    lang buffer_add_title title buffer_add_doc doc