EmbeddingsVectorizer#
- class langchain_redis.cache.EmbeddingsVectorizer[source]#
Bases:
BaseVectorizer
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- param cache: EmbeddingsCache | None = None#
- param dims: Annotated[int | None, Field(strict=True, gt=0)] = None#
- Constraints:
strict = True
gt = 0
- param dtype: str = 'float32'#
- param embeddings: Embeddings [Required]#
- param model: str = 'custom_embeddings'#
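The class wraps a LangChain `Embeddings` object and can return vectors either as lists of floats or as packed binary buffers. A minimal pure-Python sketch of that wrapper pattern follows; the `ToyEmbeddings` backend and the standalone `embed` helper are hypothetical stand-ins for illustration, not part of langchain_redis:

```python
import struct
from typing import List

# Hypothetical stand-in for a LangChain Embeddings backend: for this sketch,
# any object with an embed_query(text) -> List[float] method will do.
class ToyEmbeddings:
    def embed_query(self, text: str) -> List[float]:
        # Deterministic 4-dim "embedding" derived from character codes.
        return [float(sum(ord(c) for c in text) % (i + 7)) for i in range(4)]

def embed(text: str, backend, as_buffer: bool = False) -> "List[float] | bytes":
    """Sketch of the vectorizer's embed path: delegate to the wrapped
    Embeddings object, then optionally pack the result into a binary buffer."""
    vector = backend.embed_query(text)
    if as_buffer:
        # 'f' packs IEEE-754 float32 values, matching dtype='float32'.
        return struct.pack(f"{len(vector)}f", *vector)
    return vector

backend = ToyEmbeddings()
vec = embed("Hello world", backend)                  # list of floats
buf = embed("Hello world", backend, as_buffer=True)  # bytes, 4 bytes per float32
```

The buffer form is what gets stored in Redis, which expects raw little-endian bytes for vector fields rather than Python lists.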
- async aembed(text: str, dtype: str | VectorDataType = 'float32', **kwargs: Any) → List[float]#
Asynchronously generate a vector embedding for a text string.
- Parameters:
text (str) – The text to convert to a vector embedding
dtype (str | VectorDataType) – Output data type for the embedding vector
preprocess – Function to apply to the text before embedding
as_buffer – Return the embedding as a binary buffer instead of a list
skip_cache – Bypass the cache for this request
**kwargs (Any) – Additional model-specific parameters
- Returns:
The vector embedding as either a list of floats or a binary buffer
- Return type:
List[float]
Examples
>>> embedding = await vectorizer.aembed("Hello world")
- async aembed_many(texts: List[str], dtype: str | VectorDataType = 'float32', **kwargs: Any) → List[List[float]]#
Asynchronously generate vector embeddings for multiple texts efficiently.
- Parameters:
texts (List[str]) – List of texts to convert to vector embeddings
dtype (str | VectorDataType) – Output data type for the embedding vectors
preprocess – Function to apply to each text before embedding
batch_size – Number of texts to process in each API call
as_buffer – Return embeddings as binary buffers instead of lists
skip_cache – Bypass the cache for this request
**kwargs (Any) – Additional model-specific parameters
- Returns:
List of vector embeddings in the same order as the input texts
- Return type:
List[List[float]]
Examples
>>> embeddings = await vectorizer.aembed_many(["Hello", "World"], batch_size=2)
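The async batch path can be sketched in plain Python. The `ToyAsyncEmbeddings` backend and the standalone `aembed_many` helper below are hypothetical illustrations (the real method delegates to the wrapped LangChain `Embeddings` object), but they show the key contract: batches are awaited in order, so results align with the input texts.

```python
import asyncio
from typing import List

# Hypothetical async backend; aembed_documents mirrors the shape of
# LangChain's async Embeddings API.
class ToyAsyncEmbeddings:
    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        await asyncio.sleep(0)  # stand-in for a real network call
        return [[float(len(t))] for t in texts]

async def aembed_many(texts: List[str], backend,
                      batch_size: int = 10) -> List[List[float]]:
    """Sketch: await each batch sequentially so output order matches input."""
    out: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        out.extend(await backend.aembed_documents(texts[start:start + batch_size]))
    return out

vectors = asyncio.run(
    aembed_many(["Hello", "World"], ToyAsyncEmbeddings(), batch_size=2)
)
# One 1-dim vector per input text, in input order.
```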
- batchify(seq: list, size: int, preprocess: Callable | None = None)#
Split a sequence into batches of specified size.
- Parameters:
seq (list) – Sequence to split into batches
size (int) – Batch size
preprocess (Callable | None) – Optional function to preprocess each item
- Yields:
Batches of the sequence
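The batching behavior can be sketched as a small generator. This is an illustrative reimplementation, not the library's source:

```python
from typing import Callable, Iterator, Optional

def batchify(seq: list, size: int,
             preprocess: Optional[Callable] = None) -> Iterator[list]:
    """Yield consecutive slices of `seq` of length `size` (the last batch
    may be shorter), applying `preprocess` to each item when given."""
    for start in range(0, len(seq), size):
        batch = seq[start:start + size]
        if preprocess is not None:
            batch = [preprocess(item) for item in batch]
        yield batch

batches = list(batchify(["a", "b", "c", "d", "e"], 2, preprocess=str.upper))
# Three batches: ["A", "B"], ["C", "D"], ["E"]
```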
- embed(text: str, dtype: str | VectorDataType = 'float32', **kwargs: Any) → List[float]#
Generate a vector embedding for a text string.
- Parameters:
text (str) – The text to convert to a vector embedding
dtype (str | VectorDataType) – Output data type for the embedding vector
preprocess – Function to apply to the text before embedding
as_buffer – Return the embedding as a binary buffer instead of a list
skip_cache – Bypass the cache for this request
**kwargs (Any) – Additional model-specific parameters
- Returns:
The vector embedding as either a list of floats or a binary buffer
- Return type:
List[float]
Examples
>>> embedding = vectorizer.embed("Hello world")
- embed_many(texts: List[str], dtype: str | VectorDataType = 'float32', **kwargs: Any) → List[List[float]]#
Generate vector embeddings for multiple texts efficiently.
- Parameters:
texts (List[str]) – List of texts to convert to vector embeddings
dtype (str | VectorDataType) – Output data type for the embedding vectors
preprocess – Function to apply to each text before embedding
batch_size – Number of texts to process in each API call
as_buffer – Return embeddings as binary buffers instead of lists
skip_cache – Bypass the cache for this request
**kwargs (Any) – Additional model-specific parameters
- Returns:
List of vector embeddings in the same order as the input texts
- Return type:
List[List[float]]
Examples
>>> embeddings = vectorizer.embed_many(["Hello", "World"], batch_size=2)
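The `skip_cache` flag controls whether a previously computed embedding is reused. A minimal sketch of those semantics in pure Python follows; `ToyEmbeddings` and the dict-backed cache are hypothetical stand-ins (the real cache is an `EmbeddingsCache` backed by Redis), and whether a bypassed result is written back to the cache is an assumption of this sketch:

```python
from typing import Dict, List

class ToyEmbeddings:
    """Hypothetical backend that counts how often it is actually called."""
    def __init__(self) -> None:
        self.calls = 0

    def embed_query(self, text: str) -> List[float]:
        self.calls += 1
        return [float(len(text))]

def embed(text: str, backend, cache: Dict[str, List[float]],
          skip_cache: bool = False) -> List[float]:
    """Sketch of skip_cache semantics: when False, reuse a cached vector;
    when True, always recompute (here we also refresh the cache)."""
    if not skip_cache and text in cache:
        return cache[text]
    vector = backend.embed_query(text)
    cache[text] = vector
    return vector

backend, cache = ToyEmbeddings(), {}
embed("Hello", backend, cache)                    # computed: 1 backend call
embed("Hello", backend, cache)                    # cached: still 1 call
embed("Hello", backend, cache, skip_cache=True)   # bypassed: 2 calls
```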
- encode(texts: str | List[str], dtype: str | VectorDataType, **kwargs: Any) → ndarray#
Encode one or more texts as a NumPy array of the given dtype.
- Parameters:
texts (str | List[str])
dtype (str | VectorDataType)
kwargs (Any)
- Return type:
ndarray
- property type: str#
Return the type of vectorizer.