Cmarkit
CommonMark parser and abstract syntax tree.
See examples.
References.
module Textloc : sig ... end
Text locations.
module Meta : sig ... end
Node metadata.
type 'a node = 'a * Meta.t
The type for abstract syntax tree nodes. The data of type 'a and its metadata.
module Layout : sig ... end
Types for layout information.
module Block_line : sig ... end
Block lines.
module Label : sig ... end
Labels.
module Link_definition : sig ... end
Link definitions.
module Inline : sig ... end
Inlines.
module Block : sig ... end
Blocks.
module Doc : sig ... end
Documents (and parser).
module Mapper : sig ... end
Abstract syntax tree mappers.
module Folder : sig ... end
Abstract syntax tree folders.
For some documents, bare CommonMark just misses it. The extensions are here to make it hit the mark. To enable them use Doc.of_string with strict:false.
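As a minimal sketch of enabling the extensions, the snippet below parses the same text with and without strict mode and renders both results. It assumes the cmarkit library is installed; Cmarkit_html.of_doc and its safe argument are not described in this section and are taken here as an assumption about the library's HTML renderer. The helper name to_html is hypothetical.

```ocaml
(* Requires the cmarkit library (for instance via `opam install cmarkit`). *)

(* Hypothetical helper: [to_html ~strict src] parses [src] and renders it
   to an HTML fragment. With [~strict:false] the extensions are enabled. *)
let to_html ~strict (src : string) : string =
  let doc = Cmarkit.Doc.of_string ~strict src in
  (* Assumption: [~safe:true] asks the renderer to drop raw HTML. *)
  Cmarkit_html.of_doc ~safe:true doc

let () =
  let src = "Strikethrough your ~~perfect~~ imperfect thoughts." in
  print_string (to_html ~strict:true src);  (* bare CommonMark *)
  print_string (to_html ~strict:false src)  (* with extensions *)
```

The two outputs should differ, since only the non-strict parse turns the ~~ delimiters into a strikethrough node.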
Please note the following:
Cmarkit will support those.
According to pandoc.
Strikethrough your ~~perfect~~ imperfect thoughts.
Inline text delimited between two ~~ gets into an Inline.Ext_strikethrough node.
The text delimited by ~~ cannot start or end with Unicode whitespace. When a closer can close multiple openers, the nearest opener is closed. Strikethrough inlines can be nested.
According to a mix of pandoc, GLFM, GFM.
This is an inline $\sqrt(x - 1)$ math expression.
Inline text delimited between $ gets into an Inline.Ext_math_span node.
The text delimited by $ cannot start and end with Unicode whitespace. Inline math cannot be nested: after an opener, the nearest (non-escaped) closing delimiter matches. Otherwise it is parsed in essence like a code span.
It's better to get that $$ \left( \sum_{k=1}^n a_k b_k \right)^2 $$ on its own line.

A math block may also be more convenient:

```math
\left( \sum_{k=1}^n a_k b_k \right)^2 < \Phi
```
Inline text delimited by $$ gets into an Inline.Ext_math_span with the Inline.Math_span.display property set to true. Alternatively, code blocks whose language is math get into Block.Ext_math_block blocks.
In contrast to $, the text delimited by $$ can start and end with whitespace; however, it can't contain a blank line. Display math cannot be nested: after an opener, the nearest (non-escaped) closing delimiter matches. Otherwise it's parsed in essence like a code span.
According to a mix of md4c, GLFM, GFM and personal ad-hoc brewery.
* [ ] That's unchecked.
* [x] That's checked.
* [~] That's cancelled.
A list item is a task item if it starts with up to three spaces, followed by [, a single Unicode character, ] and a space (the space can be omitted if the line is empty, but subsequent indentation considers there was one). The Unicode character gets stored in Block.List_item.ext_task_marker and counts as one column regardless of the character's render width. The task marker including the final space is considered part of the list marker as far as subsequent indentation is concerned.
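The task marker rule above can be sketched with the standard library alone. This is not Cmarkit's parser: the function name task_marker is hypothetical, it works on the text that follows the list marker, and it relies on String.get_utf_8_uchar, which needs OCaml 4.14 or later.

```ocaml
(* Hypothetical helper, not part of Cmarkit. [task_marker s] inspects the
   text [s] that follows a list marker. On success it returns the task
   marker character and the byte index at which the item content starts. *)
let task_marker (s : string) : (Uchar.t * int) option =
  let len = String.length s in
  (* Skip up to three leading spaces. *)
  let rec skip i n =
    if n < 3 && i < len && s.[i] = ' ' then skip (i + 1) (n + 1) else i
  in
  let i = skip 0 0 in
  if i >= len || s.[i] <> '[' then None else
  let i = i + 1 in
  if i >= len then None else
  (* Decode exactly one Unicode character, whatever its byte length. *)
  let d = String.get_utf_8_uchar s i in
  if not (Uchar.utf_decode_is_valid d) then None else
  let u = Uchar.utf_decode_uchar d in
  let i = i + Uchar.utf_decode_length d in
  if i >= len || s.[i] <> ']' then None else
  let i = i + 1 in
  if i = len then Some (u, i)             (* empty line: space omitted *)
  else if s.[i] = ' ' then Some (u, i + 1)
  else None
```

For instance, task_marker "[x] done" yields the character 'x' with content starting at byte 4, while a line indented by four spaces or one with two characters between the brackets is rejected.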
The Unicode character indicates the status of the task. That's up to the client, but the function Block.List_item.task_status_of_task_marker, which is used by the built-in renderers, makes the following choices:
Unchecked for ' ' (U+0020).
Checked for 'x' (U+0078), 'X' (U+0058), '✓' (U+2713, CHECK MARK), '✔' (U+2714, HEAVY CHECK MARK), '𐄂' (U+10102, AEGEAN CHECK MARK), '🗸' (U+1F5F8, LIGHT CHECK MARK).
Cancelled for '~' (U+007E).
According to djot.
| # | Name      | Description           | Link                    |
|:-:|----------:|:----------------------|------------------------:|
| 1 | OCaml     | The OCaml website     | <https://ocaml.org>     |
| 2 | Haskell   | The Haskell website   | <https://haskell.org>   |
| 3 | MDN       | Web dev docs          | <https://developer.mozilla.org/> |
| 4 | Wikipedia | The Free Encyclopedia | <https://wikipedia.org> |
A table is a sequence of rows. Each row starts and ends with a (non-escaped) pipe | character. The first row can't be indented by more than three spaces; subsequent rows can be arbitrarily indented. Blanks after the final pipe are allowed.
Each row of the table contains cells separated by (non-escaped) pipe | characters. Pipes embedded in inline constructs do not count as separators (the parsing strategy is to parse the row as an inline, split the result on the | present in toplevel text nodes and strip initial and trailing blanks in cells). The number of | separators plus 1 determines the number of columns of a row. The number of columns of a table is the greatest number of columns of its rows.
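The column counting rule can be illustrated with a simplified, standard-library-only sketch. It is not Cmarkit's strategy: it scans raw text instead of parsing the row as an inline, so pipes inside inline constructs such as code spans would wrongly be split on. The names row_cells and table_columns are hypothetical.

```ocaml
(* Hypothetical helper, not Cmarkit's parser. Returns the trimmed cells of
   a row, or [None] if the line does not start and end with a non-escaped
   pipe. Indentation limits are not checked. *)
let row_cells (line : string) : string list option =
  let row = String.trim line in
  let n = String.length row in
  (* Byte indices of the non-escaped pipes, in increasing order. *)
  let rec pipes i acc =
    if i >= n then List.rev acc else
    match row.[i] with
    | '\\' -> pipes (i + 2) acc  (* skip the backslash-escaped character *)
    | '|' -> pipes (i + 1) (i :: acc)
    | _ -> pipes (i + 1) acc
  in
  (* One cell between each pair of consecutive pipes. *)
  let rec cells = function
    | p :: (q :: _ as rest) ->
        String.trim (String.sub row (p + 1) (q - p - 1)) :: cells rest
    | _ -> []
  in
  match pipes 0 [] with
  | (0 :: _ :: _) as ps when List.nth ps (List.length ps - 1) = n - 1 ->
      Some (cells ps)
  | _ -> None

(* The table's column count is the greatest column count of its rows. *)
let table_columns (lines : string list) : int =
  let widest m line =
    match row_cells line with
    | Some cs -> max m (List.length cs)
    | None -> m
  in
  List.fold_left widest 0 lines
```

With this sketch a row such as `| a | b |` has two cells, and a table whose rows have two, one and three cells has three columns.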
A separator line is a row in which every cell content is made only of one or more - optionally prefixed and suffixed by :. These rows are not data; they indicate the alignment of data in their cell for subsequent rows (multiple separator lines in a single table are allowed) and that the previous line (if any) was a row of column headers. :- is left aligned, -: is right aligned, :-: is centered. If there's no alignment specified it's left aligned.
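The separator cell syntax can be checked with a few lines of standard-library OCaml. The type align and the functions sep_cell and is_separator_line are hypothetical names for this sketch, which takes already split and trimmed cell contents as input; it does not reproduce Cmarkit's own types.

```ocaml
(* Hypothetical type for this sketch only. *)
type align = Left | Center | Right

(* [sep_cell cell] is [Some a] if [cell] is made of one or more '-',
   optionally prefixed and suffixed by ':', and [None] otherwise. A cell
   without any ':' is reported as [Left], the default. *)
let sep_cell (cell : string) : align option =
  let n = String.length cell in
  let colon_l = n > 0 && cell.[0] = ':' in
  let colon_r = n > 0 && cell.[n - 1] = ':' in
  let first = if colon_l then 1 else 0 in
  let last = if colon_r then n - 1 else n in  (* exclusive bound *)
  let rec all_dashes i = i >= last || (cell.[i] = '-' && all_dashes (i + 1)) in
  if last - first < 1 || not (all_dashes first) then None else
  Some (match colon_l, colon_r with
        | true, true -> Center
        | false, true -> Right
        | _ -> Left)

(* A row is a separator line if every one of its cells is a separator. *)
let is_separator_line (cells : string list) : bool =
  cells <> [] && List.for_all (fun c -> sep_cell c <> None) cells
```

Applied to the cells of the second row of the example table, this gives centered, right, left and right alignment for the four columns.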
Tables are stored in Block.Ext_table nodes.
According to djot for the footnote contents.
This is a footnote in history[^1] with multiple references[^1].
Footnotes are not [very special][^1] references.

 [^1]: Footnotes can have lazy continuation lines and multiple paragraphs.

  If you start one column after the left bracket, blocks still get
  into the footnote.

 But this is no longer the footnote.
Footnotes go through the label resolution mechanism and share the same namespace as link references (including the ^). They end up being defined in the Doc.defs as Block.Footnote.Def definitions. Footnote references are simply made by using Inline.Link with the corresponding labels.
A footnote definition starts with a (single line) link label followed by :. The label must start with a ^. Footnote labels go through the label resolution mechanism.
All subsequent lines indented one column further than the start of the label (i.e. starting on the ^) get into the footnote. Lazy continuation lines are supported.
The result is stored in the document's Doc.defs in Block.Footnote.Def cases, and its position in the document is witnessed by a Block.Ext_footnote_definition node which is kept for layout.
Footnote references are simply reference links with the footnote label. Linking text on footnotes is allowed. Shortcut and collapsed references to footnotes are rendered specially by Cmarkit_html.