EmbeddingsFilter#
- class langchain.retrievers.document_compressors.embeddings_filter.EmbeddingsFilter[source]#
Bases: BaseDocumentCompressor
Document compressor that uses embeddings to drop documents unrelated to the query.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- param embeddings: Embeddings [Required]#
Embeddings to use for embedding document contents and queries.
- param k: int | None = 20#
The number of relevant documents to return. Can be set to None, in which case similarity_threshold must be specified. Defaults to 20.
- param similarity_fn: Callable [Optional]#
Similarity function for comparing documents. It is expected to take two matrices (List[List[float]]) as input and return a matrix of scores, where higher values indicate greater similarity.
- param similarity_threshold: float | None = None#
Similarity threshold that a document's score against the query must pass for the document to be kept. Defaults to None; must be specified if k is set to None.
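The matrix-in, matrix-out contract described for similarity_fn can be sketched in plain Python as follows. This is a minimal illustration of the shape contract only: the name cosine_similarity_matrix and the sample vectors are hypothetical, and a callable actually passed to the class may need to return a NumPy array rather than nested lists.

```python
import math
from typing import List

Matrix = List[List[float]]


def cosine_similarity_matrix(X: Matrix, Y: Matrix) -> Matrix:
    """Hypothetical helper: scores[i][j] is the cosine similarity of X[i] and Y[j]."""

    def norm(v: List[float]) -> float:
        return math.sqrt(sum(a * a for a in v))

    scores: Matrix = []
    for x in X:
        row: List[float] = []
        for y in Y:
            denom = norm(x) * norm(y)
            dot = sum(a * b for a, b in zip(x, y))
            # Treat a zero-length vector as having no similarity to anything.
            row.append(dot / denom if denom else 0.0)
        scores.append(row)
    return scores


# One query vector (as a 1-row matrix) against three document vectors.
query = [[1.0, 0.0]]
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = cosine_similarity_matrix(query, docs)
# scores has one row (the query) and three columns (one per document).
```

Higher values mean greater similarity, so the first document (identical direction) scores highest and the second (orthogonal) scores lowest.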
- async acompress_documents(documents: Sequence[Document], query: str, callbacks: list[BaseCallbackHandler] | BaseCallbackManager | None = None) → Sequence[Document] [source]#
Filter documents based on similarity of their embeddings to the query.
- Parameters:
documents (Sequence[Document])
query (str)
callbacks (list[BaseCallbackHandler] | BaseCallbackManager | None)
- Return type:
Sequence[Document]
- compress_documents(documents: Sequence[Document], query: str, callbacks: list[BaseCallbackHandler] | BaseCallbackManager | None = None) → Sequence[Document] [source]#
Filter documents based on similarity of their embeddings to the query.
- Parameters:
documents (Sequence[Document])
query (str)
callbacks (list[BaseCallbackHandler] | BaseCallbackManager | None)
- Return type:
Sequence[Document]
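How k and similarity_threshold might combine when filtering can be sketched as below. This is a simplified stand-in, not the library's implementation: the function select_indices is hypothetical, and it assumes that the top-k cut is applied first, that the threshold comparison is strict, and that results come back in descending score order.

```python
from typing import List, Optional, Sequence


def select_indices(
    similarity: Sequence[float],
    k: Optional[int] = 20,
    similarity_threshold: Optional[float] = None,
) -> List[int]:
    """Hypothetical helper: indices of documents kept for one query."""
    if k is None and similarity_threshold is None:
        raise ValueError("Must specify one of k or similarity_threshold.")

    # Rank document indices from most to least similar to the query.
    order = sorted(range(len(similarity)), key=lambda i: similarity[i], reverse=True)

    # Assumption: keep at most k documents before thresholding.
    if k is not None:
        order = order[:k]

    # Assumption: a document must score strictly above the threshold.
    if similarity_threshold is not None:
        order = [i for i in order if similarity[i] > similarity_threshold]
    return order


# Query-to-document similarity scores for four documents (made-up values).
doc_scores = [0.91, 0.15, 0.62, 0.40]

top_two = select_indices(doc_scores, k=2)
above_cutoff = select_indices(doc_scores, k=None, similarity_threshold=0.3)
both = select_indices(doc_scores, k=3, similarity_threshold=0.5)
```

With these sample scores, k alone keeps the two best documents, the threshold alone keeps every document scoring above 0.3, and using both keeps only documents that satisfy each condition.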