GCSDocumentStorage

class langchain_google_vertexai.vectorstores.document_storage.GCSDocumentStorage(bucket: Bucket, prefix: str | None = 'documents', threaded=True, n_threads=8)

Stores documents in Google Cloud Storage. Each (id, document_text) pair is stored as a plain-text blob named {prefix}/{id}.

Constructor. bucket is the bucket where the documents will be stored; prefix is the prefix prepended to all document names.
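The {prefix}/{id} naming rule can be illustrated with a small helper. This is only a sketch: blob_name is a hypothetical function, not part of the library, and the handling of a None prefix (using the id as-is) is an assumption, since the description above does not say what happens in that case.

```python
from typing import Optional


def blob_name(prefix: Optional[str], doc_id: str) -> str:
    """Hypothetical helper: build the blob name for a document id."""
    # Assumption: a None or empty prefix means "no prefix", so the id is used as-is.
    if not prefix:
        return doc_id
    # Documented rule: the blob is named {prefix}/{id}.
    return f"{prefix}/{doc_id}"


print(blob_name("documents", "doc-1"))
print(blob_name(None, "doc-1"))
```

With the default prefix 'documents', a document with id doc-1 would therefore live at documents/doc-1 in the bucket.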

Methods

__init__(bucket[, prefix, threaded, n_threads])

Constructor.

amdelete(keys)

Async delete the given keys and their associated values.

amget(keys)

Async get the values associated with the given keys.

amset(key_value_pairs)

Async set the values for the given keys.

ayield_keys(*[, prefix])

Async get an iterator over keys that match the given prefix.

mdelete(keys)

Deletes a batch of documents by id.

mget(keys)

Gets a batch of documents by id.

mset(key_value_pairs)

Stores a batch of documents, each under its corresponding key.

yield_keys(*[, prefix])

Yields the keys present in the storage.

Parameters:
  • bucket (storage.Bucket)

  • prefix (Optional[str])

__init__(bucket: Bucket, prefix: str | None = 'documents', threaded=True, n_threads=8) → None

Constructor.

Parameters:
  • bucket (Bucket) – Bucket where the documents will be stored.

  • prefix (str | None) – Prefix that is prepended to all document names.

Return type:

None

async amdelete(keys: Sequence[K]) → None

Async delete the given keys and their associated values.

Parameters:

keys (Sequence[K]) – A sequence of keys to delete.

Return type:

None

async amget(keys: Sequence[K]) → list[V | None]

Async get the values associated with the given keys.

Parameters:

keys (Sequence[K]) – A sequence of keys.

Returns:

A sequence of optional values associated with the keys. If a key is not found, the corresponding value will be None.

Return type:

list[V | None]
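The "None for a missing key" contract of the async getters can be sketched with the standard library alone. InMemoryStore below is a hypothetical stand-in, not the real class, and the delegation of the async method to the synchronous one on a worker thread is an assumption made for illustration.

```python
import asyncio
from typing import Dict, List, Optional, Sequence


class InMemoryStore:
    """Hypothetical stand-in with a synchronous mget, for illustration only."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = dict(data)

    def mget(self, keys: Sequence[str]) -> List[Optional[str]]:
        # Missing keys map to None, at the same position as the requested key.
        return [self._data.get(key) for key in keys]

    async def amget(self, keys: Sequence[str]) -> List[Optional[str]]:
        # Assumption: the async variant runs the sync method in a worker thread.
        return await asyncio.to_thread(self.mget, keys)


async def main() -> List[Optional[str]]:
    store = InMemoryStore({"a": "alpha", "b": "beta"})
    return await store.amget(["a", "missing", "b"])


result = asyncio.run(main())
print(result)
```

The result has one entry per requested key, in request order, so callers can zip it with the key list to find which ids were absent.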

async amset(key_value_pairs: Sequence[tuple[K, V]]) → None

Async set the values for the given keys.

Parameters:

key_value_pairs (Sequence[Tuple[K, V]]) – A sequence of key-value pairs.

Return type:

None

async ayield_keys(*, prefix: str | None = None) → AsyncIterator[K] | AsyncIterator[str]

Async get an iterator over keys that match the given prefix.

Parameters:

prefix (str) – The prefix to match.

Yields:

Iterator[K | str] – An iterator over keys that match the given prefix. This method is allowed to return an iterator over either K or str depending on what makes more sense for the given store.

Return type:

AsyncIterator[K] | AsyncIterator[str]

mdelete(keys: Sequence[str]) → None

Deletes a batch of documents by id.

Parameters:

keys (Sequence[str]) – List of ids for the text.

Return type:

None

mget(keys: Sequence[str]) → List[Document | None]

Gets a batch of documents by id. The default implementation only loops over get_by_id. Subclasses that have faster ways to retrieve data by batch should implement this method.

Parameters:

keys (Sequence[str]) – List of ids for the text.

Returns:

List of documents. If an id is not found, the corresponding entry is None.

Return type:

List[Document | None]
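The constructor's threaded and n_threads arguments suggest that batch operations can run on a thread pool. The following is a minimal sketch of an order-preserving batch get under that assumption; it uses a plain dict in place of a bucket, and FAKE_BUCKET, fetch_one and fetch_many are hypothetical names that do not come from the library.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Sequence

# Hypothetical stand-in for a bucket: blob name -> plain-text content.
FAKE_BUCKET: Dict[str, str] = {
    "documents/a": "first document",
    "documents/b": "second document",
}


def fetch_one(key: str) -> Optional[str]:
    # Look up the blob named {prefix}/{id}; a missing blob yields None.
    return FAKE_BUCKET.get(f"documents/{key}")


def fetch_many(keys: Sequence[str], n_threads: int = 8) -> List[Optional[str]]:
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # Executor.map returns results in input order, whichever fetch finishes first.
        return list(pool.map(fetch_one, keys))


docs = fetch_many(["b", "missing", "a"])
print(docs)
```

Using map rather than collecting futures as they complete keeps each result aligned with its key, which the "None instead" contract relies on.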

mset(key_value_pairs: Sequence[Tuple[str, Document]]) → None

Stores a batch of documents, each under its corresponding key.

Parameters:

key_value_pairs (Sequence[Tuple[str, Document]]) – A sequence of (id, document) pairs.

Return type:

None

yield_keys(*, prefix: str | None = None) → Iterator[str]

Yields the keys present in the storage.

Parameters:

prefix (str | None) – Ignored. Uses the prefix provided in the constructor.

Return type:

Iterator[str]
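Because the prefix argument is ignored, listing always happens under the prefix given to the constructor. A minimal sketch of that behaviour follows, with a dict standing in for the bucket. FakeKeyListing is a hypothetical class, and returning ids with the prefix stripped is an assumption about what "keys" means here.

```python
from typing import Dict, Iterator, Optional


class FakeKeyListing:
    """Hypothetical stand-in that lists keys under a fixed constructor prefix."""

    def __init__(self, blobs: Dict[str, str], prefix: str = "documents") -> None:
        self._blobs = blobs
        self._prefix = prefix

    def yield_keys(self, *, prefix: Optional[str] = None) -> Iterator[str]:
        # The prefix argument is accepted but ignored; the constructor prefix is used.
        base = f"{self._prefix}/"
        for name in self._blobs:
            if name.startswith(base):
                # Assumption: keys are returned as ids, without the prefix.
                yield name[len(base):]


store = FakeKeyListing({"documents/a": "x", "documents/b": "y", "other/c": "z"})
keys = list(store.yield_keys(prefix="other"))
print(keys)
```

Passing prefix="other" changes nothing in this sketch: only the ids stored under documents/ are yielded.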