Cmarkit
CommonMark parser and abstract syntax tree.
See examples.
References.
module Textloc : sig ... end
Text locations.
module Meta : sig ... end
Node metadata.
type 'a node = 'a * Meta.t
The type for abstract syntax tree nodes. The data of type 'a and its metadata.
module Layout : sig ... end
Types for layout information.
module Block_line : sig ... end
Block lines.
module Label : sig ... end
Labels.
module Link_definition : sig ... end
Link definitions.
module Inline : sig ... end
Inlines.
module Block : sig ... end
Blocks.
module Doc : sig ... end
Documents (and parser).
module Mapper : sig ... end
Abstract syntax tree mappers.
module Folder : sig ... end
Abstract syntax tree folders.
For some documents, bare CommonMark just misses it. The extensions are here to make it hit the mark. To enable them use Doc.of_string with strict:false.
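As a minimal sketch of enabling the extensions, the following parses a string with strict:false and renders it to HTML. It assumes the cmarkit library is installed and relies on Cmarkit_html.of_doc from the companion HTML renderer; the helper name html_of_markdown is made up for illustration.

```ocaml
(* Hypothetical helper (not part of Cmarkit): parse with the extensions
   enabled and render the result to HTML. Assumes the cmarkit library. *)
let html_of_markdown (md : string) : string =
  (* ~strict:false turns on the extensions described in this section. *)
  let doc = Cmarkit.Doc.of_string ~strict:false md in
  (* ~safe:true asks the renderer to drop raw HTML and unsafe URLs. *)
  Cmarkit_html.of_doc ~safe:true doc

let () =
  print_string (html_of_markdown "Some ~~perfect~~ imperfect thoughts.\n")
```

Calling Doc.of_string without ~strict:false keeps the parser on bare CommonMark, so the ~~ delimiters would then be treated as ordinary text.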
Please note the following:
Cmarkit will support those.
According to pandoc.
Strikethrough your ~~perfect~~ imperfect thoughts.
Inline text delimited between two ~~ gets into an Inline.Ext_strikethrough node.
The text delimited by ~~ cannot start or end with Unicode whitespace. When a closer can close multiple openers, the nearest opener is closed. Strikethrough inlines can be nested.
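To illustrate how Inline.Ext_strikethrough nodes can be found in a parsed document, here is a small sketch that counts them with a Folder. It assumes the cmarkit library; strikethrough_count is a hypothetical name, and the manual recursion into the node's content is one possible way to also count nested strikethroughs.

```ocaml
(* Hypothetical helper: count strikethrough nodes, including nested ones. *)
let strikethrough_count (md : string) : int =
  let open Cmarkit in
  let inline folder acc = function
    | Inline.Ext_strikethrough (s, _meta) ->
        (* Count this node, then keep folding into its content so that
           nested strikethroughs are counted too. *)
        let content = Inline.Strikethrough.inline s in
        Folder.ret (Folder.fold_inline folder (acc + 1) content)
    | _ -> Folder.default (* let the folder thread the fold *)
  in
  let folder = Folder.make ~inline () in
  Folder.fold_doc folder 0 (Doc.of_string ~strict:false md)
```

For example, strikethrough_count "~~a~~ and ~~b~~\n" is expected to find two nodes when the extensions are enabled.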
According to a mix of pandoc, GLFM, GFM.
This is an inline $\sqrt(x - 1)$ math expression.
Inline text delimited between $ gets into an Inline.Ext_math_span node.
The text delimited by $ cannot start or end with Unicode whitespace. Inline math cannot be nested; after an opener, the nearest (non-escaped) closing delimiter matches. Otherwise it is parsed in essence like a code span.
It's better to get that $$ \left( \sum_{k=1}^n a_k b_k \right)^2 $$
on its own line. A math block may also be more convenient:
```math
\left( \sum_{k=1}^n a_k b_k \right)^2 < \Phi
```
Inline text delimited by $$ gets into an Inline.Ext_math_span with the Inline.Math_span.display property set to true. Alternatively, code blocks whose language is math get into Block.Ext_math_block blocks.
In contrast to $, the text delimited by $$ can start and end with whitespace; however, it can't contain a blank line. Display math cannot be nested; after an opener, the nearest (non-escaped) closing delimiter matches. Otherwise it's parsed in essence like a code span.
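The difference between inline and display math spans can be observed on the parsed tree. The sketch below, which assumes the cmarkit library, collects every Inline.Ext_math_span together with its Inline.Math_span.display flag and its TeX source; math_spans is a hypothetical helper name, and math blocks (Block.Ext_math_block) are deliberately left out.

```ocaml
(* Hypothetical helper: list (display, tex) pairs for the math spans of a
   document, in document order. *)
let math_spans (md : string) : (bool * string) list =
  let open Cmarkit in
  let inline _folder acc = function
    | Inline.Ext_math_span (m, _meta) ->
        (* display is true for $$...$$ spans and false for $...$ spans. *)
        let span = (Inline.Math_span.display m, Inline.Math_span.tex m) in
        Folder.ret (span :: acc)
    | _ -> Folder.default (* let the folder thread the fold *)
  in
  let folder = Folder.make ~inline () in
  List.rev (Folder.fold_doc folder [] (Doc.of_string ~strict:false md))
```

A renderer could branch on the boolean to choose between inline and display typesetting.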
According to a mix of md4c, GLFM, GFM and personal ad-hoc brewery.
* [ ] That's unchecked.
* [x] That's checked.
* [~] That's cancelled.
A list item is a task item if it starts with up to three spaces, followed by [, a single Unicode character, ] and a space (the space can be omitted if the line is empty, but subsequent indentation considers there was one). The Unicode character gets stored in Block.List_item.ext_task_marker and counts as one column regardless of the character's render width. The task marker, including the final space, is considered part of the list marker as far as subsequent indentation is concerned.
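The marker rule above can be sketched on a plain string. This is an illustration only, not Cmarkit's parser: task_marker is a hypothetical function that takes the content following the list marker and returns the marker character with the byte offset where the item text starts. It relies on String.get_utf_8_uchar from the OCaml standard library (4.14 or later).

```ocaml
(* Hypothetical helper, not part of Cmarkit: recognize a task marker at the
   start of a list item's content. *)
let task_marker (s : string) : (Uchar.t * int) option =
  let len = String.length s in
  (* Skip up to three leading spaces. *)
  let rec skip i =
    if i < len && i < 3 && s.[i] = ' ' then skip (i + 1) else i
  in
  let i = skip 0 in
  if i >= len || s.[i] <> '[' then None else
  let i = i + 1 in
  if i >= len then None else
  (* Decode exactly one Unicode character, whatever its byte length. *)
  let d = String.get_utf_8_uchar s i in
  if not (Uchar.utf_decode_is_valid d) then None else
  let u = Uchar.utf_decode_uchar d in
  let i = i + Uchar.utf_decode_length d in
  if i >= len || s.[i] <> ']' then None else
  let i = i + 1 in
  if i = len then Some (u, i) (* empty line: the final space may be omitted *)
  else if s.[i] = ' ' then Some (u, i + 1)
  else None
```

Decoding a full Unicode character rather than a byte is what lets markers such as ✓ count as a single character.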
The Unicode character indicates the status of the task. Its interpretation is up to the client, but the function Block.List_item.task_status_of_task_marker, which is used by the built-in renderers, makes the following choices:
Unchecked: ' ' (U+0020).
Checked: 'x' (U+0078), 'X' (U+0058), '✓' (U+2713, CHECK MARK), '✔' (U+2714, HEAVY CHECK MARK), '𐄂' (U+10102, AEGEAN CHECK MARK), '🗸' (U+1F5F8, LIGHT CHECK MARK).
Cancelled: '~' (U+007E).
According to djot.
| # | Name      | Description           | Link                             |
|:-:|----------:|:----------------------|---------------------------------:|
| 1 | OCaml     | The OCaml website     | <https://ocaml.org>              |
| 2 | Haskell   | The Haskell website   | <https://haskell.org>            |
| 3 | MDN       | Web dev docs          | <https://developer.mozilla.org/> |
| 4 | Wikipedia | The Free Encyclopedia | <https://wikipedia.org>          |
A table is a sequence of rows; each row starts and ends with a (non-escaped) pipe | character. The first row can't be indented by more than three spaces; subsequent rows can be arbitrarily indented. Blanks after the final pipe are allowed.
Each row of the table contains cells separated by (non-escaped) pipe | characters. Pipes embedded in inline constructs do not count as separators (the parsing strategy is to parse the row as an inline, split the result on the | present in toplevel text nodes and strip initial and trailing blanks in cells). The number of | separators plus 1 determines the number of columns of a row. The number of columns of a table is the greatest number of columns of its rows.
A separator line is a row in which every cell content is made only of one or more - optionally prefixed and suffixed by :. These rows are not data; they indicate the alignment of data in their cell for subsequent rows (multiple separator lines in a single table are allowed) and that the previous line (if any) was a row of column headers. :- is left aligned, -: is right aligned, :-: is centered. If no alignment is specified, it's left aligned.
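The separator cell rule above can be sketched in a few lines. This is an illustration under stated assumptions, not Cmarkit's implementation: the align type and align_of_sep_cell are hypothetical names, the cell is given as an already extracted string, and a cell without colons is reported as left aligned, following the text above.

```ocaml
type align = Left | Center | Right

(* Hypothetical helper, not part of Cmarkit: classify the content of one
   cell of a candidate separator line. Returns None if the cell is not made
   only of one or more '-' optionally prefixed and suffixed by ':'. *)
let align_of_sep_cell (cell : string) : align option =
  let cell = String.trim cell in
  let len = String.length cell in
  let starts = len > 0 && cell.[0] = ':' in
  let ends = len > 1 && cell.[len - 1] = ':' in
  let first = if starts then 1 else 0 in
  let last = if ends then len - 1 else len in (* exclusive bound *)
  let rec all_dashes i =
    i >= last || (cell.[i] = '-' && all_dashes (i + 1))
  in
  if last - first < 1 || not (all_dashes first) then None else
  match starts, ends with
  | true, true -> Some Center
  | false, true -> Some Right
  | true, false | false, false -> Some Left
```

A row is then a separator line exactly when this function returns Some _ for every one of its cells.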
Tables are stored in Block.Ext_table nodes.
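As a sketch of how Block.Ext_table nodes can be inspected, the following collects the column count of every table in a document. It assumes the cmarkit library and uses Block.Table.col_count; table_col_counts is a hypothetical helper name.

```ocaml
(* Hypothetical helper: column counts of the tables of a document, in
   document order. *)
let table_col_counts (md : string) : int list =
  let open Cmarkit in
  let block _folder acc = function
    | Block.Ext_table (t, _meta) ->
        Folder.ret (Block.Table.col_count t :: acc)
    | _ -> Folder.default (* let the folder thread the fold *)
  in
  let folder = Folder.make ~block () in
  List.rev (Folder.fold_doc folder [] (Doc.of_string ~strict:false md))
```

Since a table's column count is the greatest column count of its rows, a table with ragged rows still yields a single number here.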
According to djot for the footnote contents.
This is a footnote in history[^1] with multiple references[^1].
Footnotes are not [very special][^1] references.

 [^1]: Footnotes can have
lazy continuation lines and multiple paragraphs.

  If you start one column after the left bracket, blocks still get
  into the footnote.

 But this is no longer the footnote.
Footnotes go through the label resolution mechanism and share the same namespace as link references (including the ^). They end up being defined in the Doc.defs as Block.Footnote.Def definitions. Footnote references are simply made by using Inline.Link with the corresponding labels.
A footnote definition starts with a (single line) link label followed by :. The label must start with a ^. Footnote labels go through the label resolution mechanism.
All subsequent lines indented one column further than the start of the label (i.e. starting on the ^) get into the footnote. Lazy continuation lines are supported.
The result is stored in the document's Doc.defs in Block.Footnote.Def cases and its position in the document is witnessed by a Block.Ext_footnote_definition node, which is kept for layout.
Footnote references are simply reference links with the footnote label. Linking text on footnotes is allowed. Shortcut and collapsed references to footnotes are rendered specially by Cmarkit_html.
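Because footnotes share the label namespace with link references, they can be told apart by the case of their definition in Doc.defs. The sketch below assumes the cmarkit library and that Label.defs is a map from label keys to definitions; footnote_labels is a hypothetical helper name.

```ocaml
(* Hypothetical helper: keys of the labels defined as footnotes. *)
let footnote_labels (md : string) : string list =
  let open Cmarkit in
  let doc = Doc.of_string ~strict:false md in
  let add key def acc = match def with
    | Block.Footnote.Def _ -> key :: acc (* a footnote definition *)
    | _ -> acc (* link reference definitions and other cases *)
  in
  List.rev (Label.Map.fold add (Doc.defs doc) [])
```

The returned keys are the normalized labels, which for footnotes are expected to include the leading ^.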