PebbloSafeLoader

class langchain_community.document_loaders.pebblo.PebbloSafeLoader(langchain_loader: BaseLoader, name: str, owner: str = '', description: str = '', api_key: str | None = None, load_semantic: bool = False, classifier_url: str | None = None, *, classifier_location: str = 'local', anonymize_snippets: bool = False)

PebbloSafeLoader is a wrapper around a document loader that enables the loaded data to be scrutinized.
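A minimal usage sketch, assuming `langchain_community` is installed. The wrapped `CSVLoader`, the helper name `build_safe_loader`, the file path, and the `name`, `owner`, and `description` values are illustrative assumptions; only the constructor parameters themselves come from the class signature.

```python
def build_safe_loader(csv_path: str):
    """Wrap a CSVLoader in PebbloSafeLoader (all values are illustrative)."""
    # Imported lazily so the sketch can be defined without langchain installed.
    from langchain_community.document_loaders import CSVLoader, PebbloSafeLoader

    return PebbloSafeLoader(
        CSVLoader(csv_path),                    # any BaseLoader can be wrapped
        name="example-rag-app",                 # required; hypothetical app name
        owner="Example Owner",                  # optional; hypothetical
        description="Example RAG application",  # optional; hypothetical
    )


# Usage (requires langchain_community; the path is hypothetical):
# documents = build_safe_loader("data/example.csv").load()
```

The wrapper is then used in place of the original loader, through the same `load()` and `lazy_load()` calls.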

Methods

__init__(langchain_loader, name[, owner, ...])

alazy_load()

A lazy loader for Documents.

aload()

Load data into Document objects.

classify_in_batches()

Classify documents in batches.

lazy_load()

Load documents in lazy fashion.

load()

Load Documents.

load_and_split([text_splitter])

Load Documents and split into chunks.

set_discover_sent()

Parameters:
  • langchain_loader (BaseLoader)

  • name (str)

  • owner (str)

  • description (str)

  • api_key (str | None)

  • load_semantic (bool)

  • classifier_url (str | None)

  • classifier_location (str)

  • anonymize_snippets (bool)

__init__(langchain_loader: BaseLoader, name: str, owner: str = '', description: str = '', api_key: str | None = None, load_semantic: bool = False, classifier_url: str | None = None, *, classifier_location: str = 'local', anonymize_snippets: bool = False)
Parameters:
  • langchain_loader (BaseLoader)

  • name (str)

  • owner (str)

  • description (str)

  • api_key (str | None)

  • load_semantic (bool)

  • classifier_url (str | None)

  • classifier_location (str)

  • anonymize_snippets (bool)

async alazy_load() → AsyncIterator[Document]

A lazy loader for Documents.

Return type:

AsyncIterator[Document]
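As a sketch of how an `AsyncIterator` like this is typically consumed, the snippet below uses `async for` and stops early. `FakeAsyncLoader` and `take` are hypothetical stand-ins that yield strings rather than Documents; nothing here is the library's own code.

```python
import asyncio
from typing import AsyncIterator, List


class FakeAsyncLoader:
    """Hypothetical stand-in exposing the same alazy_load() shape."""

    async def alazy_load(self) -> AsyncIterator[str]:
        for text in ("first", "second", "third"):
            await asyncio.sleep(0)  # hand control back to the event loop
            yield text


async def take(loader: FakeAsyncLoader, limit: int) -> List[str]:
    """Collect up to `limit` items (limit >= 1) without exhausting the iterator."""
    collected: List[str] = []
    async for item in loader.alazy_load():
        collected.append(item)
        if len(collected) >= limit:
            break
    return collected


first_two = asyncio.run(take(FakeAsyncLoader(), limit=2))
```

Because items are produced one at a time, stopping early avoids loading the remaining documents.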

async aload() → list[Document]

Load data into Document objects.

Return type:

list[Document]

classify_in_batches() → None

Classify documents in batches. This avoids API timeouts when sending a large number of documents. Batches are generated based on the page_content size.

Return type:

None
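The idea of batching by `page_content` size can be sketched as follows. This is not the library's implementation: the `Doc` stand-in, the `batch_by_content_size` helper, the `max_chars` limit, and the use of character count as the size measure are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Doc:
    """Hypothetical stand-in for a Document; only page_content matters here."""
    page_content: str


def batch_by_content_size(docs: List[Doc], max_chars: int) -> Iterator[List[Doc]]:
    """Yield batches whose combined page_content length stays within max_chars.

    A single document longer than max_chars is emitted as its own batch.
    """
    batch: List[Doc] = []
    batch_size = 0
    for doc in docs:
        size = len(doc.page_content)
        # Close the current batch if adding this doc would exceed the limit.
        if batch and batch_size + size > max_chars:
            yield batch
            batch, batch_size = [], 0
        batch.append(doc)
        batch_size += size
    if batch:
        yield batch


docs = [Doc("a" * 40), Doc("b" * 40), Doc("c" * 30), Doc("d" * 120)]
batches = list(batch_by_content_size(docs, max_chars=100))
batch_lengths = [len(b) for b in batches]  # [2, 1, 1]
```

Each batch would then be sent to the classifier as a separate request, which keeps any single request bounded in size.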

lazy_load() → Iterator[Document]

Load documents in lazy fashion.

Raises:
  • NotImplementedError – raised when lazy_load is not implemented within the wrapped loader.

Yields:

Document – Documents from the wrapped loader's lazy loading.

Return type:

Iterator[Document]
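Since `lazy_load` can raise `NotImplementedError` when the wrapped loader lacks lazy loading, a caller may want a fallback to `load()`. The sketch below shows one way to do that, assuming the error surfaces before any document is yielded. `iter_documents`, `EagerOnlyLoader`, and `LazyLoader` are hypothetical names, and the loaders yield strings instead of Documents.

```python
from typing import Any, Iterator


def iter_documents(loader: Any) -> Iterator[Any]:
    """Yield documents lazily, falling back to load() if lazy_load is unavailable."""
    try:
        yield from loader.lazy_load()
    except NotImplementedError:
        # Assumes the error is raised before any item was yielded.
        yield from loader.load()


class EagerOnlyLoader:
    """Hypothetical loader whose lazy_load is not implemented."""

    def lazy_load(self):
        raise NotImplementedError("lazy_load is not implemented")

    def load(self):
        return ["doc-1", "doc-2"]


class LazyLoader:
    """Hypothetical loader that supports lazy loading."""

    def lazy_load(self):
        yield "doc-a"
        yield "doc-b"

    def load(self):
        return list(self.lazy_load())


eager_result = list(iter_documents(EagerOnlyLoader()))
lazy_result = list(iter_documents(LazyLoader()))
```

The `try` block wraps the iteration itself, not just the call, because a generator-based `lazy_load` raises only once iteration begins.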

load() → List[Document]

Load Documents.

Returns:

Documents fetched from load method of the wrapped loader.

Return type:

list

load_and_split(text_splitter: TextSplitter | None = None) → list[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Do not override this method. It should be considered to be deprecated!

Parameters:

text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns:

List of Documents.

Return type:

list[Document]

classmethod set_discover_sent() → None
Return type:

None
