md
Markdown-specific module.
Hint
Use pip to install the necessary dependencies for this module:
pip install mltb2[md]
- class mltb2.md.MdTextSplitter(max_token: int, transformers_token_counter: TransformersTokenCounter, show_progress_bar: bool = False)
Bases: object
Splits Markdown text into sections with a specified maximum number of tokens.
A heading is never separated from its corresponding paragraphs.
- Parameters:
max_token (int) – Maximum number of tokens per text section. The limit is exceeded only when a single Markdown chunk is by itself larger than this value.
transformers_token_counter (TransformersTokenCounter) – The token counter to be used.
show_progress_bar (bool) – Show a progress bar during processing.
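The behaviour described above (pack heading-plus-paragraph chunks into sections until max_token would be exceeded) can be illustrated with a minimal, self-contained sketch. This is not the mltb2 implementation: the helpers `split_into_chunks`, `count_tokens` and `split_md` are hypothetical names, and a whitespace word count stands in for a real `TransformersTokenCounter`. The sketch also assumes ATX-style headings (`#` to `######`) and ignores edge cases such as `#` lines inside fenced code.

```python
import re
from typing import Callable, List


def split_into_chunks(md_text: str) -> List[str]:
    """Hypothetical helper: cut Markdown into chunks that each start at a heading."""
    chunks: List[str] = []
    current: List[str] = []
    for line in md_text.splitlines():
        # A new ATX heading closes the running chunk, so every heading
        # stays together with the paragraphs that follow it.
        if re.match(r"#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): counts whitespace-separated words."""
    return len(text.split())


def split_md(
    md_text: str,
    max_token: int,
    token_counter: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Greedily pack chunks into sections of at most max_token tokens.

    A section exceeds max_token only if a single chunk is already larger.
    """
    sections: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for chunk in split_into_chunks(md_text):
        n = token_counter(chunk)
        # Flush the running section if adding this chunk would exceed the limit.
        if current and current_tokens + n > max_token:
            sections.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(chunk)
        current_tokens += n
    if current:
        sections.append("\n\n".join(current))
    return sections


if __name__ == "__main__":
    example = "# A\none two\n\n## B\nthree four five\n\n## C\nsix"
    for section in split_md(example, max_token=8):
        print(repr(section))
```

With the real class, the word-count stand-in would presumably be replaced by the `TransformersTokenCounter` instance passed to the constructor, so that the limit matches the tokenizer of the target model.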