DocumentLoaderAsParser

class langchain_community.document_loaders.parsers.documentloader_adapter.DocumentLoaderAsParser(document_loader_class: Type[BaseLoader], **kwargs: Any)

Beta

This feature is in beta. It is actively being worked on, so the API may change.

A wrapper class that adapts a document loader to function as a parser.

This class is a workaround for loaders that have no parser equivalent. Prefer a dedicated parser when one is available.

Requires the document loader to accept a file_path parameter.

Initializes the DocumentLoaderAsParser with a specific document loader class and additional arguments.

Parameters:
  • document_loader_class (Type[BaseLoader]) – The document loader class to adapt as a parser.

  • **kwargs – Additional arguments passed to the document loader’s constructor.

Raises:

TypeError – If the specified document loader does not accept a file_path parameter. Only loaders with this parameter can be adapted.
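The file_path requirement can be illustrated with a minimal, standard-library-only sketch. It assumes the check is done by inspecting the constructor signature; `PathLoader`, `UrlLoader`, and `check_adaptable` are hypothetical names used for illustration and are not part of langchain_community.

```python
import inspect


class PathLoader:
    """Hypothetical loader whose constructor takes file_path."""

    def __init__(self, file_path, mode="single"):
        self.file_path = file_path
        self.mode = mode


class UrlLoader:
    """Hypothetical loader with no file_path parameter, so it cannot be adapted."""

    def __init__(self, url):
        self.url = url


def check_adaptable(loader_class):
    """Raise TypeError unless loader_class.__init__ has a file_path parameter.

    Assumption: signature inspection is one plausible way to enforce the
    requirement; the library's actual check may differ.
    """
    params = inspect.signature(loader_class.__init__).parameters
    if "file_path" not in params:
        raise TypeError(
            f"{loader_class.__name__} does not accept a file_path parameter"
        )


check_adaptable(PathLoader)  # passes silently

try:
    check_adaptable(UrlLoader)
except TypeError as exc:
    print(exc)
```

Failing at construction time, rather than on the first parse call, surfaces an incompatible loader before any blob is processed.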

Example

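```python
from langchain_community.document_loaders.excel import UnstructuredExcelLoader
from langchain_community.document_loaders.parsers.documentloader_adapter import (
    DocumentLoaderAsParser,
)

# Initialize parser adapter with a document loader
excel_parser = DocumentLoaderAsParser(UnstructuredExcelLoader, mode="elements")
```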

Methods

__init__(document_loader_class, **kwargs)

Initializes the DocumentLoaderAsParser with a specific document loader class and additional arguments.

lazy_parse(blob)

Use underlying DocumentLoader to lazily parse the blob.

parse(blob)

Eagerly parse the blob into a document or documents.

__init__(document_loader_class: Type[BaseLoader], **kwargs: Any) → None

Initializes the DocumentLoaderAsParser with a specific document loader class and additional arguments.

Parameters:
  • document_loader_class (Type[BaseLoader]) – The document loader class to adapt as a parser.

  • **kwargs – Additional arguments passed to the document loader’s constructor.

Raises:

TypeError – If the specified document loader does not accept a file_path parameter. Only loaders with this parameter can be adapted.

Return type:

None


lazy_parse(blob: Blob) → Iterator[Document]

Use underlying DocumentLoader to lazily parse the blob.

Parameters:

blob (Blob)

Return type:

Iterator[Document]
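The delegation that lazy_parse describes can be sketched with toy classes from the standard library alone. This is an illustration of the adapter pattern under stated assumptions, not the library's implementation: `ToyBlob`, `ToyLoader`, and `ToyLoaderAsParser` are hypothetical, and the sketch assumes the blob refers to a file on disk whose path is handed to the loader as file_path.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass
class ToyBlob:
    """Hypothetical stand-in for Blob; only the path matters here."""

    path: str


class ToyLoader:
    """Hypothetical loader: yields one 'document' (a string) per line."""

    def __init__(self, file_path: str, prefix: str = "") -> None:
        self.file_path = file_path
        self.prefix = prefix

    def lazy_load(self) -> Iterator[str]:
        with open(self.file_path, encoding="utf-8") as handle:
            for line in handle:
                yield self.prefix + line.rstrip("\n")


class ToyLoaderAsParser:
    """Hypothetical adapter: builds a fresh loader for every blob."""

    def __init__(self, loader_class: type, **kwargs: Any) -> None:
        self.loader_class = loader_class
        self.loader_kwargs = kwargs

    def lazy_parse(self, blob: ToyBlob) -> Iterator[str]:
        # Assumption: the blob's path is passed as file_path, along with
        # the keyword arguments stored at construction time.
        loader = self.loader_class(file_path=blob.path, **self.loader_kwargs)
        return loader.lazy_load()


# Write a small temporary file, parse it through the adapter, then clean up.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("alpha\nbeta\n")

parser = ToyLoaderAsParser(ToyLoader, prefix="> ")
docs = list(parser.lazy_parse(ToyBlob(path=tmp.name)))
os.remove(tmp.name)
print(docs)
```

Because the loader is constructed per blob, one adapter instance can be reused across many files with the same keyword arguments.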

parse(blob: Blob) → list[Document]

Eagerly parse the blob into a document or documents.

This is a convenience method for interactive development environments.

Production applications should favor the lazy_parse method instead.

Subclasses should generally not override this parse method.

Parameters:

blob (Blob) – Blob instance

Returns:

List of documents

Return type:

list[Document]
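The difference between eager and lazy parsing can be seen in a small standalone sketch. It assumes that parse amounts to collecting the output of lazy_parse into a list; `lazy_parse_sketch` and `parse_sketch` are hypothetical functions written for this illustration, not library code.

```python
from typing import Iterator, List

produced: List[int] = []  # records which items the generator has actually built


def lazy_parse_sketch(n: int) -> Iterator[str]:
    """Yield n toy documents one at a time, noting each as it is built."""
    for i in range(n):
        produced.append(i)
        yield f"doc-{i}"


def parse_sketch(n: int) -> List[str]:
    """Eager variant: materialize every document before returning."""
    return list(lazy_parse_sketch(n))


# Lazy: asking for the first document builds only that document.
stream = lazy_parse_sketch(1000)
first = next(stream)
lazy_count = len(produced)

# Eager: every document is built up front and held in memory.
produced.clear()
docs = parse_sketch(1000)
eager_count = len(produced)

print(first, lazy_count, eager_count)
```

This is why lazy_parse is the better fit for production pipelines: a consumer can stop early or stream documents onward without holding the full list in memory.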