O365BaseLoader
- class langchain_community.document_loaders.base_o365.O365BaseLoader
Bases: BaseLoader, BaseModel
Base class for all loaders that use the O365 package.
- param auth_with_token: bool = False
Whether to authenticate with a token or not. Defaults to False.
- param chunk_size: int | str = 5242880
Number of bytes to retrieve in each API call to the server. Either an int or 'auto'. The default, 5242880, is 5 MiB.
- param handlers: Dict[str, Any] | None = {}
Provide custom handlers for MimeTypeBasedParser.
Pass a dictionary mapping either file extensions (like “doc”, “pdf”, etc.) or MIME types (like “application/pdf”, “text/plain”, etc.) to parsers. Note that you must use either file extensions or MIME types exclusively and cannot mix them.
Do not include the leading dot for file extensions.
Example using file extensions:
```python
handlers = {
    "doc": MsWordParser(),
    "pdf": PDFMinerParser(),
    "txt": TextParser(),
}
```
Example using MIME types:
```python
handlers = {
    "application/msword": MsWordParser(),
    "application/pdf": PDFMinerParser(),
    "text/plain": TextParser(),
}
```
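The key rules above (extensions or MIME types but not both, and no leading dot on extensions) can be checked before the dictionary is passed to a loader. The following is a minimal sketch only: `check_handler_keys` is a hypothetical helper that is not part of langchain_community, and it assumes a key is a MIME type exactly when it contains a "/". The `object()` values stand in for real parser instances.

```python
from typing import Any, Dict


def check_handler_keys(handlers: Dict[str, Any]) -> str:
    """Return "mime" or "extension" for a consistently keyed handlers dict.

    Hypothetical helper, not part of langchain_community. Assumption: a key
    is treated as a MIME type exactly when it contains a "/".
    """
    if not handlers:
        raise ValueError("handlers is empty")

    # Classify every key, then require that only one kind is present.
    kinds = {"mime" if "/" in key else "extension" for key in handlers}
    if len(kinds) > 1:
        raise ValueError("handlers mixes file extensions and MIME types")

    kind = kinds.pop()
    if kind == "extension":
        # File extensions must be given without the leading dot.
        dotted = [key for key in handlers if key.startswith(".")]
        if dotted:
            raise ValueError(f"remove the leading dot from: {dotted}")
    return kind


# Placeholder values stand in for parser instances such as PDFMinerParser().
by_extension = check_handler_keys({"doc": object(), "pdf": object()})
by_mime = check_handler_keys({"application/pdf": object(), "text/plain": object()})
```

A dictionary such as `{"pdf": ..., "text/plain": ...}` or `{".pdf": ...}` would raise `ValueError` in this sketch.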
- param modified_since: datetime | None = None
Only fetch documents modified since the given datetime. The datetime object must be timezone-aware.
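As an illustration of the timezone requirement, the sketch below uses only the Python standard library to build aware values suitable for `modified_since` and to contrast them with a naive one. The helper `is_timezone_aware` is a hypothetical name introduced here for the check; it is not part of the loader.

```python
from datetime import datetime, timedelta, timezone

# Timezone-aware: tzinfo is set, so utcoffset() returns a timedelta.
aware = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Naive: no tzinfo, so utcoffset() returns None.
naive = datetime(2024, 1, 1)


def is_timezone_aware(value: datetime) -> bool:
    """Hypothetical helper: True when the datetime carries a UTC offset."""
    return value.tzinfo is not None and value.utcoffset() is not None


# A relative cutoff, for example "modified within the last 7 days".
last_week = datetime.now(timezone.utc) - timedelta(days=7)
```

Here `aware` and `last_week` satisfy the requirement, while `naive` does not.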
- param recursive: bool = False
Whether to recursively load documents from subfolders. Defaults to False.
- param settings: _O365Settings [Optional]
Settings for the Office365 API client.
- async alazy_load() -> AsyncIterator[Document]
A lazy loader for Documents.
- Return type:
AsyncIterator[Document]
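An `AsyncIterator` is consumed with `async for` inside a coroutine. The sketch below shows only that consumption pattern: `StandInLoader` is a hypothetical stand-in, not a real loader, and it yields plain strings instead of `Document` objects so that the example runs without any O365 credentials.

```python
import asyncio
from typing import AsyncIterator, List


class StandInLoader:
    """Hypothetical stand-in that mimics only the shape of alazy_load()."""

    async def alazy_load(self) -> AsyncIterator[str]:
        # A real loader would yield Document objects; strings are used here.
        for text in ("first", "second"):
            yield text


async def collect(loader: StandInLoader) -> List[str]:
    # Items are pulled one at a time as the iterator produces them.
    return [item async for item in loader.alazy_load()]


docs = asyncio.run(collect(StandInLoader()))
```

With a concrete subclass of this base class, the same `async for` loop would iterate over `Document` objects.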
- load_and_split(text_splitter: TextSplitter | None = None) -> list[Document]
Load Documents and split into chunks. Chunks are returned as Documents.
Do not override this method. It should be considered to be deprecated!
- Parameters:
text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
- Returns:
List of Documents.
- Return type:
list[Document]