
HTML to text

html2text is a Python package that converts a page of HTML into clean, easy-to-read plain ASCII text.

The output also happens to be valid Markdown (a text-to-HTML format).

Installation and Setup

pip install html2text
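
Once the package is installed, a quick way to check it is to convert a small HTML snippet directly. The following is a minimal sketch, not an official recipe: the helper name `html_to_markdown` is made up for illustration, and it relies only on the `html2text.HTML2Text` class with its `ignore_links` attribute and `handle()` method.

```python
def html_to_markdown(html: str, ignore_links: bool = False) -> str:
    """Convert an HTML string to Markdown-flavoured plain text.

    `html_to_markdown` is a hypothetical helper name used for this sketch.
    """
    # Imported inside the function so this module still loads
    # when the optional html2text dependency is not installed.
    import html2text

    converter = html2text.HTML2Text()
    # When True, anchor tags are reduced to their visible text.
    converter.ignore_links = ignore_links
    return converter.handle(html)
```

For instance, `html_to_markdown("<h1>Title</h1><p>Hello <b>world</b></p>")` should return text in which the heading is written as `# Title` and the bold word as `**world**`.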

Document Transformer

See a usage example.

from langchain_community.document_transformers import Html2TextTransformer
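
As a rough illustration of how the transformer might be used, the sketch below wraps raw HTML strings in `Document` objects and runs them through `Html2TextTransformer`. It assumes the class is imported from `langchain_community.document_transformers` and that `Document` comes from `langchain_core.documents`. The helper name `html_pages_to_text` is hypothetical and not part of LangChain.

```python
from typing import List


def html_pages_to_text(html_pages: List[str]) -> List[str]:
    """Run raw HTML strings through Html2TextTransformer.

    `html_pages_to_text` is a hypothetical helper name used for this sketch.
    """
    # Imported inside the function so this module still loads
    # when the optional dependencies are not installed.
    from langchain_core.documents import Document
    from langchain_community.document_transformers import Html2TextTransformer

    # Each HTML page becomes one Document; metadata is left empty here.
    docs = [Document(page_content=html) for html in html_pages]

    # With no arguments the transformer uses its default settings.
    transformer = Html2TextTransformer()
    transformed = transformer.transform_documents(docs)

    # Return only the converted text of each document.
    return [doc.page_content for doc in transformed]
```

In a real pipeline the input documents would more likely come from an HTML loader than from hard-coded strings; the list of strings here just keeps the sketch self-contained.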
