Cmarkit_base.Text
Textual content.
Ensures UTF-8 validity, unescapes, resolves numeric and named character references.
utf_8_clean_unesc_unref b s ~first ~last unescapes CommonMark escapes, resolves HTML entity and character references in the given span, and replaces U+0000 and UTF-8 decoding errors by Uchar.rep. b is used as scratch space. If first > last, or if first and last are not valid indices of s, the result is "".
utf_8_clean_unref b s ~first ~last is like utf_8_clean_unesc_unref but does not unescape.
utf_8_clean_raw b s ~first ~last replaces U+0000 and UTF-8 decoding errors by Uchar.rep. b is used as scratch space. pad (defaults to 0) specifies a number of U+0020 spaces to prepend. If first > last, or if first and last are not valid indices of s, the result is either "" or the padded string.
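To make the cleaning behaviour concrete, here is a minimal standalone sketch of the same idea using only the OCaml standard library (4.14 or later). It is not cmarkit's implementation: the name clean_raw is hypothetical, and returning only the padding for an empty or invalid span is an assumption about what "the padded string" means.

```ocaml
(* Illustrative sketch only: [clean_raw] is a hypothetical name and this is
   not cmarkit's code. Assumes [pad >= 0] and OCaml >= 4.14. *)
let clean_raw ?(pad = 0) b s ~first ~last =
  let valid = 0 <= first && first <= last && last < String.length s in
  (* Assumption: an empty or invalid span yields only the padding,
     which is "" when [pad] is 0. *)
  if not valid then String.make pad ' ' else begin
    let span = String.sub s first (last - first + 1) in
    Buffer.clear b;
    Buffer.add_string b (String.make pad ' ');
    let rec loop i =
      if i >= String.length span then Buffer.contents b else
      let d = String.get_utf_8_uchar span i in
      (* [Uchar.utf_decode_uchar] already yields [Uchar.rep] on errors. *)
      let u = Uchar.utf_decode_uchar d in
      (* U+0000 is replaced as well. *)
      let u = if Uchar.equal u (Uchar.of_int 0) then Uchar.rep else u in
      Buffer.add_utf_8_uchar b u;
      loop (i + Uchar.utf_decode_length d)
    in
    loop 0
  end
```

The buffer is cleared on entry and only read at the end, so a single Buffer.t can serve as scratch space across many calls, as the b argument does in the functions above.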