FastEmbedSparse

class langchain_qdrant.fastembed_sparse.FastEmbedSparse(
    model_name: str = 'Qdrant/bm25',
    batch_size: int = 256,
    cache_dir: str | None = None,
    threads: int | None = None,
    providers: Sequence[Any] | None = None,
    parallel: int | None = None,
    **kwargs: Any,
)

An interface for sparse embedding models to use with Qdrant.

Sparse encoder implementation using FastEmbed (https://qdrant.github.io/fastembed/). For a list of available models, see https://qdrant.github.io/fastembed/examples/Supported_Models/

Parameters:

model_name (str) – The name of the model to use. Defaults to “Qdrant/bm25”.
batch_size (int) – Batch size for encoding. Defaults to 256.
cache_dir (str, optional) – The path to the model cache directory. Can also be set using the FASTEMBED_CACHE_PATH environment variable.
threads (int, optional) – The number of threads an onnxruntime session can use.
providers (Sequence[Any], optional) – List of ONNX execution providers.
parallel (int, optional) – If >1, data-parallel encoding is used; recommended for encoding large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading applies instead. Defaults to None.
kwargs (Any) – Additional options to pass to fastembed.SparseTextEmbedding.

Raises:

ValueError – If the model_name is not supported in SparseTextEmbedding.
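
As an illustration of how the constructor parameters fit together, here is a minimal sketch rather than a definitive recipe. It assumes the `langchain-qdrant` and `fastembed` packages are installed. `choose_parallel` and `build_encoder` are hypothetical helper names introduced only for this example, and the corpus-size threshold is an arbitrary assumption, not a library recommendation.

```python
from typing import Any, Optional


def choose_parallel(num_texts: int, threshold: int = 10_000) -> Optional[int]:
    """Pick a value for the ``parallel`` argument (hypothetical helper).

    Returns 0 (use all available cores) for large corpora and None
    (default onnxruntime threading) otherwise. The threshold is an
    arbitrary assumption made for this sketch.
    """
    return 0 if num_texts >= threshold else None


def build_encoder(num_texts: int, **kwargs: Any):
    """Construct a FastEmbedSparse encoder (hypothetical helper)."""
    # Imported lazily so this module still loads when the package is absent.
    from langchain_qdrant import FastEmbedSparse

    # A ValueError is raised here if model_name is not supported
    # by SparseTextEmbedding.
    return FastEmbedSparse(
        model_name="Qdrant/bm25",
        batch_size=256,
        parallel=choose_parallel(num_texts),
        **kwargs,
    )
```

Calling `build_encoder(500)` would leave `parallel` as None, while `build_encoder(50_000)` would pass `parallel=0`.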

Methods

`__init__`([model_name, batch_size, ...])	Sparse encoder implementation using FastEmbed (https://qdrant.github.io/fastembed/). For a list of available models, see https://qdrant.github.io/fastembed/examples/Supported_Models/
`aembed_documents`(texts)	Asynchronously embed search docs.
`aembed_query`(text)	Asynchronously embed query text.
`embed_documents`(texts)	Embed search docs.
`embed_query`(text)	Embed query text.

__init__(
    model_name: str = 'Qdrant/bm25',
    batch_size: int = 256,
    cache_dir: str | None = None,
    threads: int | None = None,
    providers: Sequence[Any] | None = None,
    parallel: int | None = None,
    **kwargs: Any,
) → None

Sparse encoder implementation using FastEmbed (https://qdrant.github.io/fastembed/). For a list of available models, see https://qdrant.github.io/fastembed/examples/Supported_Models/

Parameters:

model_name (str) – The name of the model to use. Defaults to “Qdrant/bm25”.
batch_size (int) – Batch size for encoding. Defaults to 256.
cache_dir (str, optional) – The path to the model cache directory. Can also be set using the FASTEMBED_CACHE_PATH environment variable.
threads (int, optional) – The number of threads an onnxruntime session can use.
providers (Sequence[Any], optional) – List of ONNX execution providers.
parallel (int, optional) – If >1, data-parallel encoding is used; recommended for encoding large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading applies instead. Defaults to None.
kwargs (Any) – Additional options to pass to fastembed.SparseTextEmbedding.

Raises:

ValueError – If the model_name is not supported in SparseTextEmbedding.

Return type:

None

async aembed_documents(texts: list[str]) → list[SparseVector]

Asynchronously embed search docs.

Parameters: texts (list[str])
Return type: list[SparseVector]
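
The following sketch shows one way to await `aembed_documents` from async code. The helper `index_texts` relies only on the method signature documented above, so it should work with any object exposing that method. `StubEncoder` is a hypothetical stand-in so the example runs without downloading a model, and its fake "vectors" are plain tuples, not real `SparseVector` objects.

```python
import asyncio
from typing import Any, List, Tuple


async def index_texts(encoder: Any, texts: List[str]) -> List[Tuple[str, Any]]:
    """Pair each text with its sparse vector (hypothetical helper)."""
    vectors = await encoder.aembed_documents(texts)
    return list(zip(texts, vectors))


class StubEncoder:
    """Hypothetical stand-in for FastEmbedSparse so the sketch runs offline."""

    def embed_documents(self, texts: List[str]) -> List[Any]:
        # Fake "embedding": a one-element tuple holding the text length.
        return [(len(t),) for t in texts]

    async def aembed_documents(self, texts: List[str]) -> List[Any]:
        # Off-load the blocking call to a worker thread (Python 3.9+).
        return await asyncio.to_thread(self.embed_documents, texts)


pairs = asyncio.run(index_texts(StubEncoder(), ["a", "bb", "ccc"]))
print(pairs)
```

With a real encoder, `StubEncoder()` would be replaced by a `FastEmbedSparse` instance; the rest of the code would stay the same.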

async aembed_query(text: str) → SparseVector

Asynchronously embed query text.

Parameters: text (str)
Return type: SparseVector

embed_documents(texts: list[str]) → list[SparseVector]

Embed search docs.

Parameters: texts (list[str])
Return type: list[SparseVector]
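
To make the return value more concrete, the sketch below models a sparse vector as parallel `indices` and `values` lists and converts it to an index-to-weight mapping. The `SparseVector` dataclass here is a stand-in written for this example. It assumes the library's `SparseVector` exposes `indices` and `values` fields, and the numbers are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SparseVector:
    """Stand-in mirroring the assumed shape of the library's SparseVector."""

    indices: List[int]   # positions of the non-zero entries
    values: List[float]  # weight stored at each of those positions


def to_mapping(vec: SparseVector) -> Dict[int, float]:
    """Convert a sparse vector into an {index: weight} dictionary."""
    return dict(zip(vec.indices, vec.values))


# Illustrative values only; a real encoder produces model-specific output.
doc_vec = SparseVector(indices=[3, 17, 42], values=[0.5, 1.25, 0.75])
weights = to_mapping(doc_vec)
print(weights)
```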

embed_query(text: str) → SparseVector

Embed query text.

Parameters: text (str)
Return type: SparseVector
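
As a rough illustration of how a query vector relates to document vectors, the sketch below ranks documents by the dot product of their sparse vectors. This is a local approximation for intuition only. In normal use the vector store performs the scoring, and the scoring function it applies is not specified here. `sparse_dot` and `rank` are hypothetical helpers, and the vectors are hand-written stand-ins that assume `indices` and `values` attributes.

```python
from types import SimpleNamespace
from typing import Any, List, Tuple


def sparse_dot(a: Any, b: Any) -> float:
    """Dot product of two sparse vectors exposing ``indices`` and ``values``."""
    b_weights = dict(zip(b.indices, b.values))
    # Only indices present in both vectors contribute to the sum.
    return sum(v * b_weights.get(i, 0.0) for i, v in zip(a.indices, a.values))


def rank(query_vec: Any, doc_vecs: List[Any]) -> List[Tuple[int, float]]:
    """Return (document position, score) pairs, best match first."""
    scores = [(pos, sparse_dot(query_vec, d)) for pos, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hand-written stand-ins for embed_query / embed_documents output.
query = SimpleNamespace(indices=[1, 4], values=[1.0, 2.0])
docs = [
    SimpleNamespace(indices=[1, 2], values=[0.5, 0.5]),  # shares index 1
    SimpleNamespace(indices=[4], values=[1.5]),          # shares index 4
    SimpleNamespace(indices=[9], values=[3.0]),          # no overlap
]
ranking = rank(query, docs)
print(ranking)
```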