Html2TextTransformer

class langchain_community.document_transformers.html2text.Html2TextTransformer(ignore_links: bool = True, ignore_images: bool = True)

Convert the HTML content of documents to plain, Markdown-style text using the html2text package.

Parameters:
  • ignore_links (bool) – Whether links should be ignored; defaults to True.

  • ignore_images (bool) – Whether images should be ignored; defaults to True.

Example

Methods

__init__([ignore_links, ignore_images])

atransform_documents(documents, **kwargs)

Asynchronously transform a list of documents.

transform_documents(documents, **kwargs)

Transform a list of documents.

__init__(ignore_links: bool = True, ignore_images: bool = True) → None
Parameters:
  • ignore_links (bool)

  • ignore_images (bool)

Return type:

None

async atransform_documents(documents: Sequence[Document], **kwargs: Any) → Sequence[Document]

Asynchronously transform a list of documents.

Parameters:
  • documents (Sequence[Document]) – A sequence of Documents to be transformed.

  • kwargs (Any)

Returns:

A sequence of transformed Documents.

Return type:

Sequence[Document]

transform_documents(documents: Sequence[Document], **kwargs: Any) → Sequence[Document]

Transform a list of documents.

Parameters:
  • documents (Sequence[Document]) – A sequence of Documents to be transformed.

  • kwargs (Any)

Returns:

A sequence of transformed Documents.

Return type:

Sequence[Document]

Examples using Html2TextTransformer