AWS S3 Directory

Amazon Simple Storage Service (Amazon S3) is an object storage service.

This covers how to load Document objects from the objects stored in an AWS S3 bucket.

%pip install --upgrade --quiet boto3
from langchain_community.document_loaders import S3DirectoryLoader
# Load every object in the "testing-hwc" bucket.
loader = S3DirectoryLoader("testing-hwc")
loader.load()

Specifying a prefix

You can also specify a prefix for more fine-grained control over which files to load.

# Load only the objects whose key starts with "fake".
loader = S3DirectoryLoader("testing-hwc", prefix="fake")
loader.load()
[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': 's3://testing-hwc/fake.docx'}, lookup_index=0)]
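
S3 compares a prefix against the beginning of each object key as a plain string, so it is not limited to folder-like names. The following is a minimal sketch of that matching rule in plain Python. It does not call S3 or the loader, and the object keys and the helper name keys_matching_prefix are hypothetical, chosen only for illustration.

```python
# Hypothetical object keys in a bucket; these names are illustrative only.
keys = [
    "fake.docx",
    "fake/report.txt",
    "fakery/notes.md",
    "real/fake.docx",
]


def keys_matching_prefix(keys, prefix):
    """Return the keys that start with the given prefix (plain string match)."""
    return [key for key in keys if key.startswith(prefix)]


# "fake" matches any key beginning with those four characters,
# including "fakery/notes.md", but not "real/fake.docx".
matched = keys_matching_prefix(keys, "fake")
print(matched)
```

If only the contents of a folder-like path are wanted, ending the prefix with a slash (for example "fake/") narrows the match accordingly.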

Configuring the AWS Boto3 client

You can configure the AWS Boto3 client by passing named arguments when creating the S3DirectoryLoader. This is useful, for instance, when AWS credentials can't be set as environment variables. See the boto3 documentation for the list of parameters that can be configured.

loader = S3DirectoryLoader(
    "testing-hwc", aws_access_key_id="xxxx", aws_secret_access_key="yyyy"
)
loader.load()
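
Rather than hard-coding credentials in source, one option is to collect the named arguments from wherever your application keeps its settings and pass them through. The sketch below assumes the values live in a mapping such as application-specific environment variables. The variable names MY_APP_S3_KEY_ID and MY_APP_S3_SECRET and the helper loader_kwargs_from_env are hypothetical, not part of LangChain or boto3.

```python
import os

# Hypothetical variable names; use whatever your deployment defines.
ENV_TO_KWARG = {
    "MY_APP_S3_KEY_ID": "aws_access_key_id",
    "MY_APP_S3_SECRET": "aws_secret_access_key",
}


def loader_kwargs_from_env(env=None):
    """Build loader keyword arguments from the settings that are actually set."""
    env = os.environ if env is None else env
    # Skip missing or empty values so the loader falls back to its defaults.
    return {
        kwarg: env[name]
        for name, kwarg in ENV_TO_KWARG.items()
        if env.get(name)
    }


# A plain dict stands in for the real environment in this example.
kwargs = loader_kwargs_from_env(
    {"MY_APP_S3_KEY_ID": "xxxx", "MY_APP_S3_SECRET": "yyyy"}
)

# The result can then be passed through when creating the loader:
# loader = S3DirectoryLoader("testing-hwc", **kwargs)
```

Omitting unset values, instead of passing None or empty strings, leaves boto3 free to use its normal credential lookup when nothing is configured.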
