O365BaseLoader

class langchain_community.document_loaders.base_o365.O365BaseLoader

Bases: BaseLoader, BaseModel

Base class for all loaders that use the O365 package.

param auth_with_token: bool = False

Whether to authenticate with a token or not. Defaults to False.

param chunk_size: int | str = 5242880

Number of bytes to retrieve from each API call to the server. Either an int or "auto".
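The default of 5242880 bytes is 5 MiB (5 × 1024 × 1024). As a rough sketch, and assuming each API call retrieves exactly one chunk, the number of calls needed for a file can be estimated with a ceiling division. `estimate_calls` is a hypothetical helper written for illustration; it is not part of the loader.

```python
import math

DEFAULT_CHUNK_SIZE = 5 * 1024 * 1024  # 5242880 bytes, the documented default


def estimate_calls(file_size_bytes: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Estimate the API calls needed to download a file (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return math.ceil(file_size_bytes / chunk_size)


# A 12 MiB file at 5 MiB per call needs three calls.
print(estimate_calls(12 * 1024 * 1024))  # → 3
```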

param handlers: Dict[str, Any] | None = {}

Provide custom handlers for MimeTypeBasedParser.

Pass a dictionary mapping either file extensions (like “doc”, “pdf”, etc.) or MIME types (like “application/pdf”, “text/plain”, etc.) to parsers. Note that you must use either file extensions or MIME types exclusively and cannot mix them.

Do not include the leading dot for file extensions.

Example using file extensions:

```python
handlers = {
    "doc": MsWordParser(),
    "pdf": PDFMinerParser(),
    "txt": TextParser(),
}
```

Example using MIME types:

```python
handlers = {
    "application/msword": MsWordParser(),
    "application/pdf": PDFMinerParser(),
    "text/plain": TextParser(),
}
```
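The two rules above (keys must be all extensions or all MIME types, and extensions carry no leading dot) can be checked before the dictionary is passed to a loader. The following is a minimal sketch, not the loader's own validation: `handler_key_style` is a hypothetical helper, and treating any key that contains "/" as a MIME type is an assumption made here for illustration.

```python
from typing import Any, Dict


def handler_key_style(handlers: Dict[str, Any]) -> str:
    """Return "mime" or "extension" for a handlers dict (hypothetical helper).

    Assumption: a key containing "/" is a MIME type; any other key is
    treated as a file extension.
    """
    if not handlers:
        raise ValueError("handlers is empty")
    mime_flags = {"/" in key for key in handlers}
    if len(mime_flags) > 1:
        raise ValueError("handlers mixes MIME types and file extensions")
    if True in mime_flags:
        return "mime"
    dotted = [key for key in handlers if key.startswith(".")]
    if dotted:
        raise ValueError(f"remove the leading dot from: {dotted}")
    return "extension"


# Placeholder objects stand in for real parser instances.
print(handler_key_style({"pdf": object(), "txt": object()}))  # → extension
```

A dictionary such as `{"pdf": ..., "text/plain": ...}` or `{".pdf": ...}` raises `ValueError` in this sketch.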

param modified_since: datetime | None = None

Only fetch documents modified since given datetime. The datetime object must be timezone aware.
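As an illustration of the timezone requirement, the sketch below builds an aware datetime with the standard library and contrasts it with a naive one. `is_timezone_aware` is a hypothetical helper that follows the definition of "aware" in the Python `datetime` documentation; it is not part of the loader.

```python
from datetime import datetime, timedelta, timezone


def is_timezone_aware(moment: datetime) -> bool:
    """True if the datetime carries a usable UTC offset."""
    return moment.tzinfo is not None and moment.tzinfo.utcoffset(moment) is not None


# Aware: carries tzinfo, so it is suitable as a modified_since value.
one_week_ago = datetime.now(timezone.utc) - timedelta(days=7)

# Naive: no tzinfo attached, so it does not meet the requirement.
naive = datetime(2024, 1, 1)

print(is_timezone_aware(one_week_ago), is_timezone_aware(naive))  # → True False
```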

param recursive: bool = False

Whether to recursively load documents from subfolders. Defaults to False.

param settings: _O365Settings [Optional]

Settings for the Office365 API client.

async alazy_load() -> AsyncIterator[Document]

A lazy loader for Documents.

Return type:

AsyncIterator[Document]

async aload() -> list[Document]

Load data into Document objects.

Return type:

list[Document]
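The relationship between the two async methods can be sketched with a stand-in async generator. This example does not use the O365 loader or `Document` objects: `fake_alazy_load` and `collect` are hypothetical names, and the strings stand in for documents. It only shows the consumption pattern (`async for` over an async iterator, or gathering everything into a list).

```python
import asyncio
from typing import AsyncIterator, List


async def fake_alazy_load() -> AsyncIterator[str]:
    """Stand-in async generator; a real loader would yield Document objects."""
    for name in ("a.docx", "b.pdf"):
        await asyncio.sleep(0)  # hand control back to the event loop, as I/O would
        yield f"contents of {name}"


async def collect() -> List[str]:
    # Similar in spirit to aload(): gather every yielded item into a list.
    return [doc async for doc in fake_alazy_load()]


docs = asyncio.run(collect())
print(len(docs))  # → 2
```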

lazy_load() -> Iterator[Document]

A lazy loader for Documents.

Return type:

Iterator[Document]

load() -> list[Document]

Load data into Document objects.

Return type:

list[Document]
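The practical difference between lazy and eager loading can be shown with a stand-in class. `FakeLoader` below is hypothetical and yields strings rather than `Document` objects; it assumes that `load()` simply collects everything `lazy_load()` yields. The point of the sketch is that a lazy iterator only produces the items that are actually requested.

```python
from itertools import islice
from typing import Iterator, List


class FakeLoader:
    """Stand-in for a loader; yields strings instead of Document objects."""

    def __init__(self, names: List[str]) -> None:
        self.names = names
        self.fetched: List[str] = []  # records which items were actually produced

    def lazy_load(self) -> Iterator[str]:
        for name in self.names:
            self.fetched.append(name)
            yield f"contents of {name}"

    def load(self) -> List[str]:
        # Assumption: eager loading collects everything the lazy iterator yields.
        return list(self.lazy_load())


loader = FakeLoader(["a.docx", "b.pdf", "c.txt"])
first_two = list(islice(loader.lazy_load(), 2))  # stop after two items
print(loader.fetched)  # → ['a.docx', 'b.pdf']
```

Because the third item is never requested, it is never produced, which is why `lazy_load()` is the better fit for large folders.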

load_and_split(text_splitter: TextSplitter | None = None) -> list[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Do not override this method; it should be considered deprecated.

Parameters:

text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns:

List of Documents.

Return type:

list[Document]