GCSDocumentStorage

class langchain_google_vertexai.vectorstores.document_storage.GCSDocumentStorage(
bucket: Bucket,
prefix: str | None = 'documents',
threaded=True,
n_threads=8,
)

Stores documents in Google Cloud Storage. For each (id, document_text) pair, the blob is named {prefix}/{id} and stored in plain text format.
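
The naming rule can be illustrated with a small helper. This is a sketch only: `blob_name` is a hypothetical function, not part of the library, and its handling of a `None` prefix (using the bare id) is an assumption that this page does not confirm.

```python
from typing import Optional


def blob_name(prefix: Optional[str], doc_id: str) -> str:
    """Build a blob name following the {prefix}/{id} pattern (hypothetical helper)."""
    if prefix is None:
        # Assumption: with no prefix, the id alone is used as the blob name.
        return doc_id
    return f"{prefix}/{doc_id}"


# With the default prefix, id "doc-42" maps to the blob "documents/doc-42".
name = blob_name("documents", "doc-42")
```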

Constructor.

Parameters:
  • bucket (storage.Bucket) – Bucket where the documents will be stored.

  • prefix (Optional[str]) – Prefix that is prepended to all document names.

Methods

__init__(bucket[, prefix, threaded, n_threads])

Constructor.

amdelete(keys)

Async delete the given keys and their associated values.

amget(keys)

Async get the values associated with the given keys.

amset(key_value_pairs)

Async set the values for the given keys.

ayield_keys(*[, prefix])

Async get an iterator over keys that match the given prefix.

mdelete(keys)

Deletes a batch of documents by id.

mget(keys)

Gets a batch of documents by id.

mset(key_value_pairs)

Stores a series of documents, each under its own key.

yield_keys(*[, prefix])

Yields the keys present in the storage.

__init__(
bucket: Bucket,
prefix: str | None = 'documents',
threaded=True,
n_threads=8,
) → None

Constructor.

Parameters:
  • bucket (Bucket) – Bucket where the documents will be stored.

  • prefix (str | None) – Prefix that is prepended to all document names.

Return type:

None
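
The `threaded` and `n_threads` arguments are not described above. One plausible reading, assumed here and not confirmed by this page, is that batch operations fan out over a thread pool of `n_threads` workers. The sketch below shows that general pattern with a hypothetical `upload_all` helper and a caller-supplied `upload_one` callable; neither name belongs to the library.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def upload_all(
    pairs: Sequence[Tuple[str, str]],
    upload_one: Callable[[str, str], None],
    threaded: bool = True,
    n_threads: int = 8,
) -> None:
    """Apply upload_one to every (id, text) pair, optionally in parallel."""
    if not threaded:
        for doc_id, text in pairs:
            upload_one(doc_id, text)
        return
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() drains the result iterator so worker exceptions surface here.
        list(pool.map(lambda pair: upload_one(*pair), pairs))


# Stand-in for a real upload: record what would have been written.
uploaded: List[str] = []
upload_all([("a", "x"), ("b", "y")], lambda i, t: uploaded.append(f"{i}={t}"))
```

Completion order is not guaranteed in the threaded branch, so callers in this sketch should not rely on the order of side effects.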

async amdelete(
keys: Sequence[K],
) → None

Async delete the given keys and their associated values.

Parameters:

keys (Sequence[K]) – A sequence of keys to delete.

Return type:

None

async amget(
keys: Sequence[K],
) → list[V | None]

Async get the values associated with the given keys.

Parameters:

keys (Sequence[K]) – A sequence of keys.

Returns:

A sequence of optional values associated with the keys. If a key is not found, the corresponding value will be None.

Return type:

list[V | None]

async amset(
key_value_pairs: Sequence[tuple[K, V]],
) → None

Async set the values for the given keys.

Parameters:

key_value_pairs (Sequence[tuple[K, V]]) – A sequence of key-value pairs.

Return type:

None

async ayield_keys(
*,
prefix: str | None = None,
) → AsyncIterator[K] | AsyncIterator[str]

Async get an iterator over keys that match the given prefix.

Parameters:

prefix (str | None) – The prefix to match.

Yields:

Iterator[K | str] – An iterator over keys that match the given prefix. This method is allowed to return an iterator over either K or str depending on what makes more sense for the given store.

Return type:

AsyncIterator[K] | AsyncIterator[str]

mdelete(
keys: Sequence[str],
) → None

Deletes a batch of documents by id.

Parameters:

keys (Sequence[str]) – List of ids of the documents to delete.

Return type:

None

mget(
keys: Sequence[str],
) → List[Document | None]

Gets a batch of documents by id. The default implementation only loops get_by_id. Subclasses that have faster ways to retrieve data by batch should implement this method.

Parameters:

keys (Sequence[str]) – List of ids of the documents to retrieve.

Returns:

List of documents. If an id is not found, None is returned in its place.

Return type:

List[Document | None]
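
The batch contract (one result per requested key, with None marking a missing id) can be sketched with an in-memory stand-in. `InMemoryDocumentStore` and `Doc` are hypothetical names used only for illustration; they replace the GCS bucket and the Document type so that the example runs without cloud credentials, and they say nothing about how the real class talks to the bucket.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class Doc:
    """Hypothetical stand-in for a document object."""

    page_content: str


class InMemoryDocumentStore:
    """Hypothetical store mirroring the mset/mget/mdelete signatures."""

    def __init__(self) -> None:
        self._data: Dict[str, Doc] = {}

    def mset(self, key_value_pairs: Sequence[Tuple[str, Doc]]) -> None:
        for key, doc in key_value_pairs:
            self._data[key] = doc

    def mget(self, keys: Sequence[str]) -> List[Optional[Doc]]:
        # One entry per requested key, in order; None where the id is unknown.
        return [self._data.get(key) for key in keys]

    def mdelete(self, keys: Sequence[str]) -> None:
        for key in keys:
            # Deleting an unknown id is a no-op in this sketch.
            self._data.pop(key, None)


store = InMemoryDocumentStore()
store.mset([("a", Doc("first")), ("b", Doc("second"))])
results = store.mget(["a", "missing", "b"])
```

Because the result list lines up with the input keys, callers can pair them with `zip(keys, results)` and filter out the None entries.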

mset(
key_value_pairs: Sequence[Tuple[str, Document]],
) → None

Stores a series of documents, each under its own key.

Parameters:

key_value_pairs (Sequence[Tuple[str, Document]]) – A sequence of (id, document) pairs.

Return type:

None

yield_keys(
*,
prefix: str | None = None,
) → Iterator[str]

Yields the keys present in the storage.

Parameters:

prefix (str | None) – Ignored. Uses the prefix provided in the constructor.

Return type:

Iterator[str]
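
Because the `prefix` argument is ignored, callers get the same keys whatever they pass. The sketch below imitates that behaviour over a plain list of blob names. `list_keys` is a hypothetical function, and returning the part of each name after the constructor prefix is an assumption about how keys map back from blob names.

```python
from typing import Iterator, List, Optional


def list_keys(
    blob_names: List[str],
    constructor_prefix: str,
    *,
    prefix: Optional[str] = None,
) -> Iterator[str]:
    """Yield keys for blobs under constructor_prefix (hypothetical helper)."""
    del prefix  # Ignored, mirroring the documented behaviour.
    head = f"{constructor_prefix}/"
    for name in blob_names:
        if name.startswith(head):
            # Assumption: the key is the blob name with the prefix stripped.
            yield name[len(head):]


blobs = ["documents/a", "documents/b", "other/c"]
keys = list(list_keys(blobs, "documents", prefix="anything"))
```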