LlamaCppEmbeddings

class langchain_community.embeddings.llamacpp.LlamaCppEmbeddings

Bases: BaseModel, Embeddings

llama.cpp embedding models.

To use this class, install the llama-cpp-python library and pass the path to the Llama model as a named parameter to the constructor. See: abetlen/llama-cpp-python

Example

from langchain_community.embeddings import LlamaCppEmbeddings
llama = LlamaCppEmbeddings(model_path="/path/to/model.bin")
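A slightly fuller sketch of the same constructor call, followed by the two synchronous methods documented below. The helper name `build_embedder` and the model path are placeholders for illustration and are not part of the library. The helper returns None instead of raising when the model file is missing or langchain_community is not installed.

```python
import os


def build_embedder(model_path, n_ctx=512, n_batch=512):
    """Return a LlamaCppEmbeddings instance, or None if it cannot be built.

    Hypothetical helper for illustration; not part of langchain_community.
    """
    if not os.path.isfile(model_path):
        return None  # no model file at this path, nothing to load
    try:
        from langchain_community.embeddings import LlamaCppEmbeddings
    except ImportError:
        return None  # langchain_community is not installed
    return LlamaCppEmbeddings(
        model_path=model_path, n_ctx=n_ctx, n_batch=n_batch, verbose=False
    )


embedder = build_embedder("/path/to/model.bin")  # placeholder path
if embedder is not None:
    doc_vectors = embedder.embed_documents(["first document", "second document"])
    query_vector = embedder.embed_query("a question about the documents")
    # One vector per document, and a single vector for the query.
    print(len(doc_vectors), len(query_vector))
```

The guard keeps the snippet safe to run on a machine without a model file; in real use you would likely let the constructor raise so that a misconfigured path is noticed early.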

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

param device: str | None = None

Device type to use and pass to the model.

param f16_kv: bool = False

Use half-precision for key/value cache.

param logits_all: bool = False

Return logits for all tokens, not just the last token.

param model_path: str [Required]

Path to the Llama model file.

param n_batch: int | None = 512

Number of tokens to process in parallel. Should be a number between 1 and n_ctx.
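The constraint on n_batch can be checked before constructing the model. The following is a minimal sketch; `check_batch_size` is a hypothetical helper, not something the library provides, and whether the library itself enforces this range is not stated here.

```python
def check_batch_size(n_batch, n_ctx):
    """Raise ValueError unless 1 <= n_batch <= n_ctx (hypothetical helper).

    n_batch may be None, as in the field's type; None is passed through.
    """
    if n_batch is None:
        return None
    if not 1 <= n_batch <= n_ctx:
        raise ValueError(
            f"n_batch={n_batch} must be between 1 and n_ctx={n_ctx}"
        )
    return n_batch


# The documented defaults (n_batch=512, n_ctx=512) satisfy the constraint.
checked = check_batch_size(512, 512)
```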

param n_ctx: int = 512

Token context window.

param n_gpu_layers: int | None = None

Number of layers to load into GPU memory. Defaults to None.

param n_parts: int = -1

Number of parts to split the model into. If -1, the number of parts is automatically determined.

param n_threads: int | None = None

Number of threads to use. If None, the number of threads is automatically determined.

param seed: int = -1

Seed. If -1, a random seed is used.

param use_mlock: bool = False

Force system to keep model in RAM.

param verbose: bool = True

Print verbose output to stderr.

param vocab_only: bool = False

Only load the vocabulary, no weights.

async aembed_documents(texts: list[str]) → list[list[float]]

Asynchronously embed search documents.

Parameters:

texts (list[str]) – List of texts to embed.

Returns:

List of embeddings.

Return type:

list[list[float]]

async aembed_query(text: str) → list[float]

Asynchronously embed query text.

Parameters:

text (str) – Text to embed.

Returns:

Embedding.

Return type:

list[float]
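Both async methods are awaited from inside a coroutine. The sketch below uses a stand-in class with the same four method names so that it runs without a model file; the dummy vectors and the use of `asyncio.to_thread` to offload the blocking call are assumptions made for illustration, not a description of how LlamaCppEmbeddings implements these methods.

```python
import asyncio


class StandInEmbeddings:
    """Stand-in with the documented method names; returns dummy vectors."""

    def embed_documents(self, texts):
        # Dummy embedding: [length of the text, 0.0].
        return [[float(len(t)), 0.0] for t in texts]

    def embed_query(self, text):
        return self.embed_documents([text])[0]

    async def aembed_documents(self, texts):
        # Assumption: run the blocking sync method in a worker thread.
        return await asyncio.to_thread(self.embed_documents, texts)

    async def aembed_query(self, text):
        return await asyncio.to_thread(self.embed_query, text)


async def main():
    emb = StandInEmbeddings()
    # Embed the documents and the query concurrently.
    return await asyncio.gather(
        emb.aembed_documents(["alpha", "be"]),
        emb.aembed_query("abc"),
    )


docs, query = asyncio.run(main())
```

With a real LlamaCppEmbeddings instance in place of the stand-in, the calling pattern in `main` would stay the same.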

embed_documents(texts: List[str]) → List[List[float]]

Embed a list of documents using the Llama model.

Parameters:

texts (List[str]) – The list of texts to embed.

Returns:

List of embeddings, one for each text.

Return type:

List[List[float]]

embed_query(text: str) → List[float]

Embed a query using the Llama model.

Parameters:

text (str) – The text to embed.

Returns:

Embeddings for the text.

Return type:

List[float]
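A common use of these two methods is to rank documents against a query by comparing their vectors. The following is a minimal sketch using cosine similarity; the helper names and the small hand-written vectors are illustrative stand-ins for the output of embed_query and embed_documents, and cosine similarity is one reasonable choice of metric rather than something this class prescribes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_documents(query_vector, doc_vectors):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vector, v) for v in doc_vectors]
    return sorted(range(len(doc_vectors)), key=lambda i: scores[i], reverse=True)


# Stand-ins for embed_query(...) and embed_documents(...) results.
query_vector = [1.0, 0.0]
doc_vectors = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranking = rank_documents(query_vector, doc_vectors)
```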
