FastEmbedEmbeddings#

class langchain_community.embeddings.fastembed.FastEmbedEmbeddings[source]#

Bases: BaseModel, Embeddings

Qdrant FastEmbedding models.

FastEmbed is a lightweight, fast Python library built for embedding generation. See more documentation at:

* qdrant/fastembed
* https://qdrant.github.io/fastembed/

To use this class, you must install the fastembed Python package.

pip install fastembed

Example

from langchain_community.embeddings import FastEmbedEmbeddings
fastembed = FastEmbedEmbeddings()

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

param batch_size: int = 256#

Batch size for encoding. Higher values will use more memory, but be faster. Defaults to 256.

param cache_dir: str | None = None#

The path to the cache directory. Defaults to local_cache in the parent directory.

param doc_embed_type: Literal['default', 'passage'] = 'default'#

Type of embedding to use for documents. The available options are “default” and “passage”.

param max_length: int = 512#

The maximum number of tokens. Defaults to 512. Unknown behavior for values > 512.

param model: Any = None#
param model_name: str = 'BAAI/bge-small-en-v1.5'#

Name of the FastEmbedding model to use. Defaults to “BAAI/bge-small-en-v1.5”. Find the list of supported models at https://qdrant.github.io/fastembed/examples/Supported_Models/

param parallel: int | None = None#

If >1, parallel encoding is used, recommended for encoding of large datasets. If 0, use all available cores. If None, don’t use data-parallel processing, use default onnxruntime threading. Defaults to None.

param threads: int | None = None#

The number of threads a single onnxruntime session can use. Defaults to None.
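As a rough illustration of how the parameters above fit together, the sketch below collects them into keyword arguments and builds the embedder on demand. The helper name `build_embedder`, the `EMBEDDER_KWARGS` dictionary, and the chosen values are illustrative assumptions, not recommendations from the library. The import is deferred into the helper so the settings can be inspected without fastembed installed.

```python
from typing import Any, Dict

# Illustrative settings; every key is a parameter documented above.
EMBEDDER_KWARGS: Dict[str, Any] = {
    "model_name": "BAAI/bge-small-en-v1.5",  # documented default
    "max_length": 512,                       # documented default
    "batch_size": 64,                        # assumed value, below the default of 256 to use less memory
    "doc_embed_type": "passage",             # one of the two documented options
    "parallel": None,                        # None: no data-parallel processing
    "threads": None,                         # None: documented default
}


def build_embedder(**overrides: Any):
    """Hypothetical helper: build FastEmbedEmbeddings from the settings above."""
    # Imported lazily; requires `pip install fastembed langchain-community`.
    from langchain_community.embeddings import FastEmbedEmbeddings

    return FastEmbedEmbeddings(**{**EMBEDDER_KWARGS, **overrides})
```

Calling `build_embedder(batch_size=256)` would override a single setting while keeping the rest.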

async aembed_documents(texts: list[str]) → list[list[float]]#

Asynchronously embed search documents.

Parameters:

texts (list[str]) – List of text to embed.

Returns:

List of embeddings.

Return type:

list[list[float]]

async aembed_query(text: str) → list[float]#

Asynchronously embed query text.

Parameters:

text (str) – Text to embed.

Returns:

Embedding.

Return type:

list[float]
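The two asynchronous methods can be awaited from any coroutine. The following is a minimal sketch, assuming an already constructed embedder object is passed in; the function name `embed_async` is hypothetical, and the calls are awaited one after the other rather than concurrently.

```python
import asyncio
from typing import List, Tuple


async def embed_async(
    embedder, texts: List[str], query: str
) -> Tuple[List[List[float]], List[float]]:
    """Hypothetical helper: await the async document and query methods in turn."""
    doc_vectors = await embedder.aembed_documents(texts)
    query_vector = await embedder.aembed_query(query)
    return doc_vectors, query_vector


# Usage (requires fastembed to be installed):
# from langchain_community.embeddings import FastEmbedEmbeddings
# docs, q = asyncio.run(embed_async(FastEmbedEmbeddings(), ["some text"], "a question"))
```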

embed_documents(texts: List[str]) → List[List[float]][source]#

Generate embeddings for documents using FastEmbed.

Parameters:

texts (List[str]) – The list of texts to embed.

Returns:

List of embeddings, one for each text.

Return type:

List[List[float]]
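Because the method returns one embedding per input text, in a list, the results can be paired back with their texts. The sketch below is one way to do that, assuming the returned list follows the input order; `embed_by_text` is a hypothetical helper, and duplicate texts would collapse into a single dictionary key.

```python
from typing import Dict, List


def embed_by_text(embedder, texts: List[str]) -> Dict[str, List[float]]:
    """Hypothetical helper: map each text to the vector returned for it."""
    vectors = embedder.embed_documents(texts)
    # Guard against a mismatch before zipping texts and vectors together.
    if len(vectors) != len(texts):
        raise ValueError("expected one embedding per input text")
    return dict(zip(texts, vectors))
```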

embed_query(text: str) → List[float][source]#

Generate query embeddings using FastEmbed.

Parameters:

text (str) – The text to embed.

Returns:

Embeddings for the text.

Return type:

List[float]
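A common use of `embed_query` together with `embed_documents` is ranking documents against a query. The following is a minimal sketch using cosine similarity in plain Python; the helper names `cosine_similarity` and `rank_documents` are hypothetical, and the choice of cosine similarity is an assumption rather than something this class prescribes.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_documents(embedder, query: str, texts: List[str]) -> List[Tuple[str, float]]:
    """Hypothetical helper: sort texts by similarity to the query, best first."""
    query_vector = embedder.embed_query(query)
    doc_vectors = embedder.embed_documents(texts)
    scored = [(t, cosine_similarity(query_vector, v)) for t, v in zip(texts, doc_vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Any object exposing `embed_query` and `embed_documents`, such as a `FastEmbedEmbeddings` instance, can be passed as `embedder`.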

classmethod validate_environment(values: Dict) → Dict[source]#

Validate that FastEmbed has been installed.

Parameters:

values (Dict)

Return type:

Dict
