Cmarkit_html
Rendering CommonMark to HTML.
Generates HTML fragments; consult the integration notes for requirements on the webpage.
See a quick example and another one.
Warning. Rendering outputs are unstable; they may be tweaked even between minor versions of the library.
val of_doc : ?backend_blocks:bool -> safe:bool -> Cmarkit.Doc.t -> string
of_doc ~safe d
is an HTML fragment for d
. See renderer
for more details and documentation about rendering options.
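As a minimal sketch of the call above, assuming the cmarkit library is installed and linked, the following parses a CommonMark string with Cmarkit.Doc.of_string and renders it with of_doc. The helper name html_of_md is hypothetical and only serves the illustration.

```ocaml
(* Sketch only: html_of_md is a hypothetical helper, not part of Cmarkit. *)
let html_of_md ~safe (md : string) : string =
  (* Parse the text as strict CommonMark (the default of Doc.of_string). *)
  let doc = Cmarkit.Doc.of_string md in
  (* Render the document to an HTML fragment, not a full page. *)
  Cmarkit_html.of_doc ~safe doc

let fragment = html_of_md ~safe:true "Hello *world*!"
```

The result is a fragment, so it still needs a page frame before being served as a document.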
val renderer : ?backend_blocks:bool -> safe:bool -> unit -> Cmarkit_renderer.t
renderer ~safe ()
is the default HTML renderer. This renders the strict CommonMark abstract syntax tree and the supported Cmarkit extensions.
The inline, block and document renderers always return true
. Unknown block and inline values are rendered by an HTML comment.
The following options are available:
safe
, if true
HTML blocks and raw HTML inlines are discarded and replaced by an HTML comment in the output. In addition, the URLs of autolinks, links and images that satisfy Cmarkit.Inline.Link.is_unsafe
are replaced by the empty string.
Using safe renderings is a good first step at preventing XSS from untrusted user inputs but you should rather post-process rendering outputs with a dedicated HTML sanitizer.
backend_blocks
, if true
, code blocks with language =html
are written verbatim in the output (iff safe
is false
) and any other code block whose language starts with =
is dropped. Defaults to false
.
See this example to extend or selectively override the renderer.
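To illustrate the safe option described above, here is a hedged sketch that renders the same input twice, once with safe set to false and once with safe set to true. The helper render and the sample input are assumptions made for the illustration; exact output strings are not shown because rendering outputs are unstable.

```ocaml
(* Sketch only: render is a hypothetical helper wrapping the default renderer. *)
let render ?backend_blocks ~safe (md : string) : string =
  let r = Cmarkit_html.renderer ?backend_blocks ~safe () in
  Cmarkit_renderer.doc_to_string r (Cmarkit.Doc.of_string md)

(* An HTML block followed by a link whose URL uses the javascript: scheme. *)
let md = "<script>alert(1)</script>\n\n[click](javascript:alert(1))\n"

(* With ~safe:false the HTML block and the URL are written to the output. *)
let unsafe_html = render ~safe:false md

(* With ~safe:true the HTML block becomes an HTML comment and the unsafe
   URL is replaced by the empty string. *)
let safe_html = render ~safe:true md
```

Even with safe set to true, running the result through a dedicated HTML sanitizer remains the recommended approach for untrusted input.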
val xhtml_renderer :
?backend_blocks:bool ->
safe:bool ->
unit ->
Cmarkit_renderer.t
xhtml_renderer
is like renderer
but explicitly closes empty tags to possibly make the output valid XML. Note that it still renders HTML blocks and inline raw HTML unless safe
is true
(which also suppresses some URLs).
See this example to extend or selectively override the renderer.
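The difference between the two renderers can be observed with a small sketch like the following, which renders one document containing a hard line break and a thematic break with both. The sample input is an assumption chosen because these constructs produce empty tags.

```ocaml
(* A hard line break (two trailing spaces) and a thematic break. *)
let md = "line one  \nline two\n\n***\n"
let doc = Cmarkit.Doc.of_string md

(* Default HTML renderer. *)
let html =
  Cmarkit_renderer.doc_to_string (Cmarkit_html.renderer ~safe:true ()) doc

(* XHTML renderer: same document, empty tags explicitly closed. *)
let xhtml =
  Cmarkit_renderer.doc_to_string (Cmarkit_html.xhtml_renderer ~safe:true ()) doc
```

Comparing html and xhtml should show the two outputs differing only in how empty tags are closed.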
Only useful if you extend the renderer.
val safe : Cmarkit_renderer.context -> bool
safe c
is true
if a safe rendering is requested. See renderer
for more information.
val html_escaped_uchar : Cmarkit_renderer.context -> Stdlib.Uchar.t -> unit
html_escaped_uchar c u
renders the UTF-8 encoding of u
on c
with HTML markup delimiters <
, >
, &
, and "
escaped to HTML entities (Single quotes '
are not escaped, use "
to delimit your attributes). This also renders U+0000 to Uchar.rep
.
buffer_add_html_escaped_uchar
is html_escaped_uchar
but appends to a buffer value.
val html_escaped_string : Cmarkit_renderer.context -> string -> unit
html_escaped_string c s
renders string s
on c
with HTML markup delimiters <
, >
, &
, and "
escaped to HTML entities (Single quotes '
are not escaped, use "
to delimit your attributes).
buffer_add_html_escaped_string
is html_escaped_string
but appends to a buffer value.
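The escaping functions above can be used as in the following sketch. It assumes buffer_add_html_escaped_string takes a Buffer.t followed by the string, and it extends the default renderer with Cmarkit_renderer.make and Cmarkit_renderer.compose. The names escape_html, shouting_renderer and shout are hypothetical.

```ocaml
(* Escape a string for HTML text or double-quoted attribute values. *)
let escape_html (s : string) : string =
  let b = Buffer.create (String.length s) in
  Cmarkit_html.buffer_add_html_escaped_string b s;
  Buffer.contents b

(* Hypothetical renderer: upper-cases text nodes and defers everything
   else to the default HTML renderer. *)
let shouting_renderer ~safe : Cmarkit_renderer.t =
  let inline c = function
    | Cmarkit.Inline.Text (t, _) ->
        (* Escape after transforming so markup delimiters stay harmless. *)
        Cmarkit_html.html_escaped_string c (String.uppercase_ascii t);
        true (* handled here *)
    | _ -> false (* let the default HTML renderer handle it *)
  in
  let default = Cmarkit_html.renderer ~safe () in
  Cmarkit_renderer.compose default (Cmarkit_renderer.make ~inline ())

let shout ~safe (md : string) : string =
  Cmarkit_renderer.doc_to_string
    (shouting_renderer ~safe) (Cmarkit.Doc.of_string md)
```

Because single quotes are not escaped, values produced by escape_html should only be placed inside attributes delimited by double quotes.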
val pct_encoded_string : Cmarkit_renderer.context -> string -> unit
pct_encoded_string c s
renders string s
on c
with everything percent encoded except %
and the unreserved
, sub-delims
and the gen-delims
URI characters except brackets [
and ]
(to match the cmark
tool).
In other words only characters %
a-z
A-Z
0-9
-
.
_
~
!
$
&
'
(
)
*
+
,
;
=
:
/
?
#
@
are not percent-encoded.
Warning. The function also replaces both &
and '
by their corresponding HTML entities, so you can't use this in a context that doesn't allow entities. Besides, this assumes s
may already have percent-encoded bits, so it doesn't percent-encode %
. As such, you can't use this as a general percent-encoding function.
buffer_add_pct_encoded_string b s
is pct_encoded_string
but appends to a buffer value.
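A small sketch of the behaviour described above, assuming buffer_add_pct_encoded_string takes a Buffer.t followed by the string. The helper pct_encode and the sample path are hypothetical.

```ocaml
(* Sketch only: pct_encode is a hypothetical helper around the buffer variant. *)
let pct_encode (s : string) : string =
  let b = Buffer.create (String.length s) in
  Cmarkit_html.buffer_add_pct_encoded_string b s;
  Buffer.contents b

(* The space and the brackets get percent-encoded. The existing %20 is left
   alone, and & is replaced by an HTML entity, so the result is meant for an
   HTML attribute rather than for a raw URL. *)
let href = pct_encode "docs/my file[1].html?q=a&b=%20"
```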
If a language lang
can be extracted from the info string of a code block with Cmarkit.Block.Code_block.language_of_info_string
, a language-lang
class is added to the corresponding code
element. If you want to highlight the syntax, adding highlight.js to your page is an option.
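As an illustration of the language class, the sketch below renders a fenced code block whose info string is ocaml and extracts the language from that info string directly. The sample document uses a tilde fence and is an assumption made for the example.

```ocaml
(* A fenced code block with the info string "ocaml". *)
let md = String.concat "\n" [ "~~~ocaml"; "let x = 1"; "~~~"; "" ]

(* Language extraction as performed on the info string; the result is an
   option holding the language and the remainder of the info string. *)
let lang = Cmarkit.Block.Code_block.language_of_info_string "ocaml"

(* The rendered code element is expected to carry a language-ocaml class. *)
let html = Cmarkit_html.of_doc ~safe:true (Cmarkit.Doc.of_string md)
```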
Heading identifiers and anchors are added to the output whenever Cmarkit.Block.Heading.id
holds a value. If the identifier already exists it is made unique by appending "-"
and the first number starting from 1 that makes it unique.
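The uniqueness rule stated above can be sketched in plain OCaml as follows. This is an illustration of the rule as described, not the library's actual implementation; unique_id and assign_ids are hypothetical names.

```ocaml
module String_set = Set.Make (String)

(* [unique_id seen id] is [id] if it is not in [seen]. Otherwise it is [id]
   followed by "-" and the first number, starting from 1, that yields an
   identifier not in [seen]. The updated set is returned alongside. *)
let unique_id (seen : String_set.t) (id : string) : string * String_set.t =
  let rec first_free n =
    let candidate = Printf.sprintf "%s-%d" id n in
    if String_set.mem candidate seen then first_free (n + 1) else candidate
  in
  let id' = if String_set.mem id seen then first_free 1 else id in
  (id', String_set.add id' seen)

(* Assign identifiers to headings in document order. *)
let assign_ids (ids : string list) : string list =
  let add (seen, acc) id =
    let id', seen = unique_id seen id in
    (seen, id' :: acc)
  in
  List.rev (snd (List.fold_left add (String_set.empty, []) ids))

(* assign_ids ["intro"; "usage"; "intro"; "intro"]
   = ["intro"; "usage"; "intro-1"; "intro-2"] *)
```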
If your document has Cmarkit.Inline.Ext_math_span
inlines or Cmarkit.Block.Ext_math_block
blocks, the default renderer outputs them in \(
, \)
and \[
, \]
delimiters. You should add KaTeX or MathJax to your page so that these bits are rendered with the typography they deserve.
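A hedged sketch of how math ends up in the output: it assumes math spans are only parsed when extensions are enabled by passing ~strict:false to Cmarkit.Doc.of_string, and that dollar signs delimit inline math in the input. The sample formula is arbitrary.

```ocaml
(* Inline math in the source, delimited by dollar signs. *)
let md = "Euler: $e^{i\\pi} + 1 = 0$\n"

(* Math is a Cmarkit extension, so strict CommonMark parsing is turned off. *)
let doc = Cmarkit.Doc.of_string ~strict:false md

(* The math span is expected to be wrapped in \( and \) in the fragment,
   ready to be picked up by a math typesetting script on the page. *)
let html = Cmarkit_html.of_doc ~safe:true doc
```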
The default renderers only generate HTML fragments. You may want to add a page frame. For example:
let html_doc_of_md ?(lang = "en") ~title ~safe md =
  let doc = Cmarkit.Doc.of_string md in
  let r = Cmarkit_html.renderer ~safe () in
  let buffer_add_doc = Cmarkit_renderer.buffer_add_doc r in
  let buffer_add_title = Cmarkit_html.buffer_add_html_escaped_string in
  Printf.kbprintf Buffer.contents (Buffer.create 1024)
{|<html lang="%s">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>%a</title>
</head>
<body>
%a</body>
</html>|}
    lang buffer_add_title title buffer_add_doc doc